Good day.
Today I want to talk about ORM SQLAlchemy. Let's talk about what it is about its capabilities and flexibility, and also consider cases that are not always clearly described.
This ORM has a threshold of entry above the average, so I will try to explain everything in simple language and with examples. This article will be useful to those who already work with sqlalchemy and want to upgrade their skills or just get acquainted with this library.
The programming language used is python 3.6.
DB - PostgreSQL.
Github link
So what is ORM?
ORM (Object-Relational Mapping) is a technology that allows you to map models whose types are incompatible. For example: a database table and a programming language object.
In other words, you can access class objects to manage data in database tables. You can also create, modify, delete, filter, and, most importantly, inherit class objects mapped to database tables, which significantly reduces the content of the code base.
To use SQLAlchemy, you need to understand how it works.
Developers who use Django-ORM will have to rebuild their mindset a bit to create ORM queries. In my opinion, SQLAlchemy is a functional monster whose capabilities can and should be used, but you need to understand that ORMs are not always perfect. Therefore, we will discuss moments when the use of this technology is advisable.
SQLAlchemy has the concept of declarative and non-declarative model definitions.
Non-declarative definitions imply the use of mapper (), which describes the mapping of each database column and model class.
This article uses a declarative definition of models.
More here
For complete data consistency, let's create the following tables.
The basic model is used to determine the basic columns in the database.
class BaseModel(Base): __abstract__ = True id = Column(Integer, nullable=False, unique=True, primary_key=True, autoincrement=True) created_at = Column(TIMESTAMP, nullable=False) updated_at = Column(TIMESTAMP, nullable=False) def __repr__(self): return "<{0.__class__.__name__}(id={0.id!r})>".format(self)
Employee - a table describing the employee who works in the office
class Employee(BaseModel): __tablename__ = 'employees' first_name = Column(VARCHAR(255), nullable=False) last_name = Column(VARCHAR(255), nullable=False) phone = Column(VARCHAR(255), unique=True, nullable=True) description = Column(VARCHAR(255), nullable=True)
EmployeeWithSkills is not a table. The class inherited from Employee. A great opportunity to separate the logic and use the class as if it were a separate table.
class EmployeeWithSkills(Employee): skills = relation(Skill, secondary=EmployeesSkills.__tablename__, lazy='joined')
Department - the department in which this employee works. A person can consist of several departments.
class Department(BaseModel): __tablename__ = 'departments' name = Column(VARCHAR(255), nullable=False) description = Column(VARCHAR(255), nullable=False)
The correspondence table of the employee and the units in which he is a member.
class EmployeeDepartments(BaseModel): __tablename__ = 'employee_departments' employee_id = Column(Integer, ForeignKey('employees.id', ondelete='CASCADE'), nullable=False, index=True) department_id = Column(Integer, ForeignKey('departments.id', ondelete='CASCADE'), nullable=False, index=True)
Correspondence table of employees and their skills.
class EmployeesSkills(BaseModel): __tablename__ = 'employees_skills' employee_id = Column(ForeignKey('employee.id', ondelete='CASCADE'), nullable=False, index=True) skill_id = Column(ForeignKey('skills.id', ondelete='CASCADE'), nullable=False, index=True)
We create migrations using the alembic package, which allows you to generate them automatically. In this lesson, auto-generation of migrations is perfectly acceptable.
The latest migration contains test data that will fill the database.
How to configure alembic can be read here
We carry out the treasured alembic upgrade head to complete the migration.
Let's make the first request and get information about the employee by his id.
The request will look like this:
lesson1:
employee = session.query(Employee).filter(Employee.id == eid).one() output: ID: 2, Tony Stark
.one()
at the end means that we intend to get only one entry. If there are several entries, an appropriate exception will be raised.
If we want to get all the departments available, then we can use the following query using .all()
lesson2:
emmployee = session.query(Department).all() output: ID: 2, name: Guards ID: 4, name: Legions
Consider working with aggregation functions.
We can get the number of departments available using the built-in function.
.count()
or use func.count()
. Using the second method, you can access any SQL functions using either select
or to calculate intermediate results.
lesson3:
def get_departments_count(session: DBSession) -> int: count = session.query(Department).count() return count def get_departments_func_count(session: DBSession) -> int: count = session.query(func.count(Department.id)).scalar() return count
Many developers use the count()
function to check for data in a request. This is not a good practice, which leads to the use of additional database resources and an increase in query execution time. A good solution would be to use the exists()
function that returns a scalar value:
lesson3:
def check_department_exists(session: DBSession, department_name: str) -> bool: is_exists = session.query(exists().where(Department.name == department_name)).scalar() return is_exists
Moving on, we complicate the task and get acquainted with the entity relation
or relationship
. The fact is that in SQLAlchemy
in addition to using foreign_key
at the database level, relationships between objects are also used.
Thus, we can get the database row dependent on the foreign key in the object.
These objects are a projection on the database tables, interconnected.
Relations
in SQLAlchemy
has a flexible configuration, allowing you to get data from the database in different ways at different times using the named argument lazy
.
The main degrees of "laziness":
select
dynamic
joined
subquery
The default is select
.
Filtering in queries can be static and dynamic. Dynamic filtering allows filling the request with filters, which may vary depending on the progress of the function.
lesson4:
def dynamic_filter(session: DBSession, filter: DFilter = None): query = session.query(Employee) if filter is not None: query = query.filter(*filter.conds) employees = query.all() return employees
The DFilter filter class defines filters based on any input. If the filter class is defined, but further in the request conditions apply.
The .filter () function accepts accepts the SQLAlchemy binary conditions, so it can be represented using *
The use of dynamic filters is limited only by imagination. The result of the query shows which heroes are currently inactive.
output: Inactive_heros: Name: Tony Stark Name: Scott Lang Name: Peter Parker
I suggest working with the many-to-many relationship.
We have an Employee table in which there is a relation to the EmployeesSkills correspondence table. It contains foreign_key on the employee table and foreign_key
to the skill table.
lesson 5:
def get_employee_with_skills(session: DBSession, eid: int): employee = session.query(EmployeeWithSkills).filter(EmployeeWithSkills.id == eid).one() return employee output: Employee Tony Stark has skills: Skill: Fly, Desc: I belive I can Fly. I belive I can touch the sky Skill: Light Shield, Desc: Light protect. Perfect for everything
Using the EmployeeWithSkills class in the query above, we refer to it as a database table, but in reality such a table does not exist. This class is different from Employee having a relation, which has a many-to-many relationship. So we can differentiate the logic of the classes, filling it with a different set of relations. As a result of the request, we will see the skills of one of the employees.
Since the employee can be in several departments, create a relation that allows you to get this information.
Create an EmployeeWithDepartments class inherited from Employee and add the following:
class EmployeeWithDepartments(Employee): departments = relation( Department, # primaryjoin=EmployeeDepartments.employee_id == Employee.id, secondary=EmployeeDepartments.__tablename__, # secondaryjoin=EmployeeDepartments.department_id == Department.id, )
The created class is not a new database table. This is still the same Employee table, only expanded using relation
. This way you can access the Employee
or EmployeeWithDepartments
table in queries. The difference will only be in the absence / presence of relation
.
The first argument indicates to which table we will create the relation
.
primaryjoin
is the condition by which the second table will be connected before it is attached to the object.
secondary
is the name of the table containing foreign_keys to match. Used in the case of many-to-many.
secondaryjoin
- conditions for matching the intermediate table with the last.
primaryjoin
and secondaryjoin
serve to explicitly indicate correspondences in complex situations.
Sometimes situations arise when it is necessary to create filters whose fields are declared in relations, and relations in turn are relations of the original class.
EmployeeWithCadreMovements -> relation(CadreMovement) -> field
If the relation displays a list of values, then you need to use .any (), if there is only one value, then you need to use .has ()
For a better understanding, this construct will be interpreted in the SQL language into the exists () construct.
We call the get function with the reason parameter parameter, for example, simple
.
lesson6
def has_in_relations(session: DBSession, reason: str): employees = session.query(EmployeeWithCadreMovements).filter(EmployeeWithCadreMovements.cadre_movements.any(CadreMovement.reason == reason)).all() return employees output: [Steve Rogers, Tony Stark]
lession7
Consider the possibility of obtaining a relation using the aggregation function. For example, we get the last personnel movement of a specific user.
primaryjoin is a condition for joining tables (in the case of using lazy = 'joined'). Recall that select is used by default.
In this case, a separate request is generated when accessing the class attribute. It is for this request that we can specify the filtering conditions.
As you know, you cannot use aggregation functions in a βpureβ form in a WHERE condition, so we can implement this feature by specifying relation
with the following parameters:
last_cadre_movement = relation( CadreMovement, primaryjoin=and_( CadreMovement.employee == Employee.id, uselist=False, CadreMovement.id == select([func.max(CadreMovement.id)]).where(CadreMovement.employee == Employee.id) ) )
When executed, the request compiles like this:
SELECT cadre_movements.id AS cadre_movements_id, cadre_movements.created_at AS cadre_movements_created_at, cadre_movements.updated_at AS cadre_movements_updated_at, cadre_movements.employee AS cadre_movements_employee, cadre_movements.old_department AS cadre_movements_old_department, cadre_movements.new_department AS cadre_movements_new_department, cadre_movements.reason AS cadre_movements_reason FROM cadre_movements WHERE cadre_movements.employee = %(param_1)s AND cadre_movements.id = ( SELECT max(cadre_movements.id) AS max_1 FROM cadre_movements WHERE cadre_movements.employee = %(param_1)s )
SQLAlchemy is a powerful query building tool that reduces development time by supporting inheritance.
But you should keep a fine line between using ORM and writing complex queries. In some cases, ORM can confuse the developer or make the code cumbersome and unreadable.
Good luck