mirror of
https://gitlab.com/Chill-Projet/chill-bundles.git
synced 2025-06-07 18:44:08 +00:00
Update documentation to explain use of EXISTS in SQL queries
Added an explanatory section to the "exports.rst" doc to clarify why to use an EXISTS subquery instead of a JOIN clause in SQL queries involving many-to-* relationships. This explanation includes sample SQL queries and results to illustrate the potential issue of duplicates with JOIN and count, and how EXISTS can help avoid this issue. Also updated the ".editorconfig" file for .rst files.
This commit is contained in:
parent
c888b5b84f
commit
09578a775c
@ -23,3 +23,7 @@ max_line_length = 0
|
|||||||
indent_size = 2
|
indent_size = 2
|
||||||
indent_style = space
|
indent_style = space
|
||||||
|
|
||||||
|
[.rst]
|
||||||
|
ident_size = 3
|
||||||
|
ident_style = space
|
||||||
|
|
||||||
|
@ -242,3 +242,129 @@ This is an example of the *filter by birthdate*. This filter asks some informati
|
|||||||
Continue to explain the export framework
|
Continue to explain the export framework
|
||||||
|
|
||||||
.. _main bundle: https://git.framasoft.org/Chill-project/Chill-Main
|
.. _main bundle: https://git.framasoft.org/Chill-project/Chill-Main
|
||||||
|
|
||||||
|
|
||||||
|
With many-to-* relationship, why should we set WHERE clauses in an EXISTS subquery instead of a JOIN ?
|
||||||
|
``````````````````````````````````````````````````````````````````````````````````````````````````````
|
||||||
|
|
||||||
|
As we described above, the doctrine builder is converted into a sql query. Let's see how to compute the "number of course
|
||||||
|
which count at least one activity type with the id 7". For the purpose of this demonstration, we will restrict this on
|
||||||
|
two accompanying period only: the ones with id 329 and 334.
|
||||||
|
|
||||||
|
Let's see the list of activities associated with those accompanying period:
|
||||||
|
|
||||||
|
.. code-block:: sql
|
||||||
|
|
||||||
|
SELECT id, accompanyingperiod_id, type_id FROM activity WHERE accompanyingperiod_id IN (329, 334) AND type_id = 7
|
||||||
|
ORDER BY accompanyingperiod_id;
|
||||||
|
|
||||||
|
We see that we have 6 activities for the accompanying period with id 329, and only one for the 334's one.
|
||||||
|
|
||||||
|
.. csv-table::
|
||||||
|
:header: id, accompany$iungperiod_id, type_id
|
||||||
|
|
||||||
|
990,329,7
|
||||||
|
986,329,7
|
||||||
|
987,329,7
|
||||||
|
993,329,7
|
||||||
|
991,329,7
|
||||||
|
992,329,7
|
||||||
|
1000,334,7
|
||||||
|
|
||||||
|
Let's calculate the average duration for those accompanying periods, and the number of period:
|
||||||
|
|
||||||
|
.. code-block:: sql
|
||||||
|
|
||||||
|
SELECT AVG(age(COALESCE(closingdate, CURRENT_DATE), openingdate)), COUNT(id) from chill_person_accompanying_period WHERE id IN (329, 334);
|
||||||
|
|
||||||
|
The result of this query is:
|
||||||
|
|
||||||
|
.. csv-table::
|
||||||
|
:header: AVG, COUNT
|
||||||
|
|
||||||
|
2 years 2 mons 21 days 12 hours 0 mins 0.0 secs,2
|
||||||
|
|
||||||
|
Now, we count the number of accompanying period, adding a :code:`JOIN` clause which make a link to the :code:`activity` table, and add a :code:`WHERE` clause to keep
|
||||||
|
only the accompanying period which contains the given activity type:
|
||||||
|
|
||||||
|
.. code-block:: sql
|
||||||
|
|
||||||
|
SELECT COUNT(chill_person_accompanying_period.id) from chill_person_accompanying_period
|
||||||
|
JOIN activity ON chill_person_accompanying_period.id = activity.accompanyingperiod_id
|
||||||
|
WHERE chill_person_accompanying_period.id IN (329, 334) AND activity.type_id = 7;
|
||||||
|
|
||||||
|
What are the results here ?
|
||||||
|
|
||||||
|
.. csv-table::
|
||||||
|
:header: COUNT
|
||||||
|
|
||||||
|
7
|
||||||
|
|
||||||
|
:code:`7` ! Why this result ? Because the number of lines is duplicated for each activity. Let's see the list of rows which
|
||||||
|
are taken into account for the computation:
|
||||||
|
|
||||||
|
.. code-block:: sql
|
||||||
|
|
||||||
|
SELECT chill_person_accompanying_period.id, activity.id from chill_person_accompanying_period
|
||||||
|
JOIN activity ON chill_person_accompanying_period.id = activity.accompanyingperiod_id
|
||||||
|
WHERE chill_person_accompanying_period.id IN (329, 334) AND activity.type_id = 7;
|
||||||
|
|
||||||
|
.. csv-table::
|
||||||
|
:header: accompanyingperiod.id, activity.id
|
||||||
|
|
||||||
|
329,993
|
||||||
|
334,1000
|
||||||
|
329,987
|
||||||
|
329,990
|
||||||
|
329,991
|
||||||
|
329,992
|
||||||
|
329,986
|
||||||
|
|
||||||
|
For each activity, a row is created and, as we count the number of non-null :code:`accompanyingperiod.id` columns, we
|
||||||
|
count one entry for each activity (actually, we count the number of activities).
|
||||||
|
|
||||||
|
So, let's use the :code:`DISTINCT` keyword to count only once the equal ids:
|
||||||
|
|
||||||
|
.. code-block::
|
||||||
|
|
||||||
|
SELECT COUNT(DISTINCT chill_person_accompanying_period.id) from chill_person_accompanying_period
|
||||||
|
JOIN activity ON chill_person_accompanying_period.id = activity.accompanyingperiod_id
|
||||||
|
WHERE chill_person_accompanying_period.id IN (329, 334) AND activity.type_id = 7;
|
||||||
|
|
||||||
|
Now, it works again...
|
||||||
|
|
||||||
|
.. csv-table::
|
||||||
|
:header: COUNT
|
||||||
|
|
||||||
|
2
|
||||||
|
|
||||||
|
But, for the average duration, this won't work: the duration which are equals (because the :code:`openingdate` is the same and
|
||||||
|
:code:`closingdate` is still :code:`NULL`, for instance) will be counted only once, which will give unexpected result.
|
||||||
|
|
||||||
|
The solution is to move the condition "having an activity with activity type with id 7" in a :code:`EXISTS` clause:
|
||||||
|
|
||||||
|
.. code-block:: sql
|
||||||
|
|
||||||
|
SELECT COUNT(chill_person_accompanying_period.id) from chill_person_accompanying_period
|
||||||
|
WHERE chill_person_accompanying_period.id IN (329, 334) AND EXISTS (SELECT 1 FROM activity WHERE type_id = 7 AND accompanyingperiod_id = chill_person_accompanying_period.id);
|
||||||
|
|
||||||
|
The result is correct without :code:`DISTINCT` keyword:
|
||||||
|
|
||||||
|
.. csv-table::
|
||||||
|
:header: COUNT
|
||||||
|
|
||||||
|
2
|
||||||
|
|
||||||
|
And we can now compute the average duration without fear:
|
||||||
|
|
||||||
|
.. code-block:: sql
|
||||||
|
|
||||||
|
SELECT AVG(age(COALESCE(closingdate, CURRENT_DATE), openingdate)) from chill_person_accompanying_period
|
||||||
|
WHERE chill_person_accompanying_period.id IN (329, 334) AND EXISTS (SELECT 1 FROM activity WHERE type_id = 7 AND accompanyingperiod_id = chill_person_accompanying_period.id);
|
||||||
|
|
||||||
|
Give the result:
|
||||||
|
|
||||||
|
.. csv-table::
|
||||||
|
:header: AVG
|
||||||
|
|
||||||
|
2 years 2 mons 21 days 12 hours 0 mins 0.0 secs
|
||||||
|
Loading…
x
Reference in New Issue
Block a user