Merge branch 'Doc-why-use-exists-in-exports' into 'master'

Update documentation to explain use of EXISTS in SQL queries See merge request Chill-Projet/chill-bundles!664
2025-08-09 09:19:51 +00:00 · 2024-03-06 11:36:01 +00:00 · 2024-03-06 11:36:01 +00:00 · f1dbc17dad
commit f1dbc17dad
parent c888b5b84f 09578a775c
2 changed files with 130 additions and 0 deletions
--- a/.editorconfig
+++ b/.editorconfig
@ -23,3 +23,7 @@ max_line_length = 0
 indent_size = 2
 indent_style = space

+[.rst]
+ident_size = 3
+ident_style = space
+
--- a/docs/source/development/exports.rst
+++ b/docs/source/development/exports.rst
@ -242,3 +242,129 @@ This is an example of the *filter by birthdate*. This filter asks some informati
   Continue to explain the export framework

 .. _main bundle: https://git.framasoft.org/Chill-project/Chill-Main
+
+
+With many-to-* relationship, why should we set WHERE clauses in an EXISTS subquery instead of a JOIN ?
+``````````````````````````````````````````````````````````````````````````````````````````````````````
+
+As we described above, the doctrine builder is converted into a sql query. Let's see how to compute the "number of course
+which count at least one activity type with the id 7". For the purpose of this demonstration, we will restrict this on
+two accompanying period only: the ones with id 329 and 334.
+
+Let's see the list of activities associated with those accompanying period:
+
+.. code-block:: sql
+
+   SELECT id, accompanyingperiod_id, type_id FROM activity WHERE accompanyingperiod_id IN (329, 334) AND type_id = 7
+       ORDER BY accompanyingperiod_id;
+
+We see that we have 6 activities for the accompanying period with id 329, and only one for the 334's one.
+
+.. csv-table::
+   :header: id, accompany$iungperiod_id, type_id
+
+   990,329,7
+   986,329,7
+   987,329,7
+   993,329,7
+   991,329,7
+   992,329,7
+   1000,334,7
+
+Let's calculate the average duration for those accompanying periods, and the number of period:
+
+.. code-block:: sql
+
+   SELECT AVG(age(COALESCE(closingdate, CURRENT_DATE), openingdate)), COUNT(id) from chill_person_accompanying_period WHERE id IN (329, 334);
+
+The result of this query is:
+
+.. csv-table::
+   :header: AVG, COUNT
+
+   2 years 2 mons 21 days 12 hours 0 mins 0.0 secs,2
+
+Now, we count the number of accompanying period, adding a :code:`JOIN` clause which make a link to the :code:`activity` table, and add a :code:`WHERE` clause to keep
+only the accompanying period which contains the given activity type:
+
+.. code-block:: sql
+
+   SELECT COUNT(chill_person_accompanying_period.id) from chill_person_accompanying_period
+                 JOIN activity ON chill_person_accompanying_period.id = activity.accompanyingperiod_id
+                 WHERE chill_person_accompanying_period.id IN (329, 334) AND activity.type_id = 7;
+
+What are the results here ?
+
+.. csv-table::
+   :header: COUNT
+
+   7
+
+:code:`7` ! Why this result ? Because the number of lines is duplicated for each activity. Let's see the list of rows which
+are taken into account for the computation:
+
+.. code-block:: sql
+
+   SELECT chill_person_accompanying_period.id, activity.id from chill_person_accompanying_period
+   JOIN activity ON chill_person_accompanying_period.id = activity.accompanyingperiod_id
+   WHERE chill_person_accompanying_period.id IN (329, 334) AND activity.type_id = 7;
+
+.. csv-table::
+   :header: accompanyingperiod.id, activity.id
+
+   329,993
+   334,1000
+   329,987
+   329,990
+   329,991
+   329,992
+   329,986
+
+For each activity, a row is created and, as we count the number of non-null :code:`accompanyingperiod.id` columns, we
+count one entry for each activity (actually, we count the number of activities).
+
+So, let's use the :code:`DISTINCT` keyword to count only once the equal ids:
+
+.. code-block::
+
+   SELECT COUNT(DISTINCT chill_person_accompanying_period.id) from chill_person_accompanying_period
+   JOIN activity ON chill_person_accompanying_period.id = activity.accompanyingperiod_id
+   WHERE chill_person_accompanying_period.id IN (329, 334) AND activity.type_id = 7;
+
+Now, it works again...
+
+.. csv-table::
+   :header: COUNT
+
+   2
+
+But, for the average duration, this won't work: the duration which are equals (because the :code:`openingdate` is the same and
+:code:`closingdate` is still :code:`NULL`, for instance) will be counted only once, which will give unexpected result.
+
+The solution is to move the condition "having an activity with activity type with id 7" in a :code:`EXISTS` clause:
+
+.. code-block:: sql
+
+   SELECT COUNT(chill_person_accompanying_period.id) from chill_person_accompanying_period
+   WHERE chill_person_accompanying_period.id IN (329, 334) AND EXISTS (SELECT 1 FROM activity WHERE type_id = 7 AND accompanyingperiod_id = chill_person_accompanying_period.id);
+
+The result is correct without :code:`DISTINCT` keyword:
+
+.. csv-table::
+   :header: COUNT
+
+   2
+
+And we can now compute the average duration without fear:
+
+.. code-block:: sql
+
+  SELECT AVG(age(COALESCE(closingdate, CURRENT_DATE), openingdate)) from chill_person_accompanying_period
+  WHERE chill_person_accompanying_period.id IN (329, 334) AND EXISTS (SELECT 1 FROM activity WHERE type_id = 7 AND accompanyingperiod_id = chill_person_accompanying_period.id);
+
+Give the result:
+
+.. csv-table::
+   :header: AVG
+
+   2 years 2 mons 21 days 12 hours 0 mins 0.0 secs