Explanation
The Basics
SimQLe is used to execute SQL and return data from relational databases, for example SQLite, MySQL, PostgreSQL and SQL Server.
SimQLe is designed to use multiple connections simultaneously, as a lot of projects use more than one database. For example a MySQL backend with a SQLite cache on a webserver.
Each database application has it's own SQL syntax. The one thing they have in common is you can execute SQL commands on them to execute actions or return data. SimQLe uses this genericness to make sure it works on all databases. By remaining database agnostic, and just sending the user-define SQL statements, all current and future relational databases (that use this generic approach) can be used.
SimQLe uses database connections defined in a yaml file called
.connections.yaml
that is at the root of the project, in your home folder, or
even loaded as a dict
if you don't want to have a file.
The Rational
SimQLe was designed to make it easy to:
- Integrate with many databases simultaneously
- Make it easy to write code that can be used in integration tests
- Have dead simple and familiar return types.
- Allow for named parameters in all database connections
Why not just use environment variables with other libraries?
Environment variables are the obvious and useful way to define database connections
with other python SQL libraries. However, housing them in a .connections.yaml
has a few advantages over that:
1) Managing Environment Variables for multiple database connections are a mess
and have no standard naming convention. DB_CONN_1
, DB_CONN_2
etc?
2) This is exacerbated by mirrored dev and test connections.
3) It makes the project easier to share. Imagine if you could share some code,
and just say "This code requires a MySQL connection called 'main',
and a SQLite connection called 'cache'". You then spin up a MySQL container and
choose the location of your sqlite database and boom, you know this code will work
on your sharee's computer.
4) You could even share with a .connections.yaml.template
file
5) Less thought goes into the naming of variables to remember which database is
which.
6) Each database uses the same SimQLe methods and functions.
7) Changing just one environment variable (SIMQLE_MODE
) is quicker.
8) Keeping a .connections.yaml
file in your home directory allows for rapid
python scripting on any system without having to worry about getting the
connection string correct each time. It also means you can change the location
of a system database that all your scripts point to in one quick change.
9) Easier to read when coming across a new project.
Testing applications with database connections
This seems to be an age-old quandary and the source of quite a few articles online. In my view, when running code for integration testing or even in development environments when seeing if the code runs as expected while actually creating it, I don't think the code itself should be changed in any way in order to be tested.
Even having code that is designed in a way to be easy to have changing database connections (depending on the environment) sounds like a poor design choice. Instead, code should be written 100% for production, and then the libraries themselves should be able to switch which databases the code is connected to.
The environment variable SIMQLE_MODE
has three options (Similar to Flask):
- production
- development
- testing
### Why are development and testing modes required?
Development mode my connect to an existing development database filled with example data. An example would be a boss walking past your desk and asking "Show me what this can do.". Spin up the project, connect to the example development database, and show off what an example process would look like.
Testing mode should ideally use a brand new database spun up from scratch. Tests should start from a blank slate and shouldn't worry about any existing data.
## The ConnectionManager
class
This is the class that should be defined and intialised once per project, pointing
to a .connections.yaml
file.
cm = ConnectionManager(".connections.yaml")
cm
can now be used, either by cm.recordset()
, or cm.execute_sql()
The Recordset class
Data is returned into a class called RecordSet. This has the 2 main attributes
of Recordset.headings
giving a tuple of the headings returned by the query,
and Recordset.data
giving a list of tuples of the returned data. Several other
methods exist for convenience, such as Recordset.column[]
which will return a
list of data in the particular column. Recordset.as_dict()
returns a list
of dicts of the data.
Executing SQL vs getting data from the database.
These are actually the same thing, except executing SQL statements doesn't expect any returning data.
Recordset vs Record and RecordScalar
ConnectionManager.recordset()
can be the only method you use, however 2 others
exist for convenience and efficiency, record
and record_scalar
.
Simply put, ConnectionManager.record()
assumes a single record is returned,
and any others are discarded. A classic example is selecting a row where the
unique primary key equals a certain value.
Likewise ConnectionManager.record_scalar
assumes only a single data point is
being returned. This datapoint is access through the .datum
attribute, for
example:
from simqle import ConnectionManager
cm = ConnectionManager()
sql = "select age from Person where PersonID = :person_id"
params = {"person_id": 5}
my_scalar = cm.record_scalar(sql=sql, params=params)
datapoint = my_scalar.datum
Giving:
>>> datapoint
35
>>> type(my_scalar)
<type RecordScalar>
>>> bool(my_scalar)
True
The truthiness of the RecordScalar class is the same as the truthiness of
the datum. Note, there is a big different between a RecordScalar returning a value,
returning None, or not returning a row (perhaps 0 records exist?). If 0 records
exist, then SimQLe will throw a NoScalarDataError
. This should be caught in a
try: except:
pattern, for example:
try:
my_scalar = cm.record_scalar(sql=sql, params=params)
if my_scalar:
print(datapoint equates to true)
else:
print(datapoint equates to false, like '', 0 or NULL)
except NoScalarDataError:
print(no data was returned)