This shows you the differences between two versions of the page.
Both sides previous revision Previous revision Next revision | Previous revision | ||
magik-demo:user:query_approximation:generalization [2013/07/23 14:36] alex |
magik-demo:user:query_approximation:generalization [2017/07/06 15:24] (current) |
||
---|---|---|---|
Line 1: | Line 1: | ||
+ | {{:magik-demo:magik_logo.png?nolink&500|}} | ||
+ | |||
====== Generalization ====== | ====== Generalization ====== | ||
Line 9: | Line 11: | ||
{{ :magik-demo:user:query_approximation:generalization.png?nolink&650 |}} | {{ :magik-demo:user:query_approximation:generalization.png?nolink&650 |}} | ||
- | This schema illustrates all the steps that the generalization algorithm does. In order to be more comprehensive, we will use and example (MUSIC COMPILATION). | + | This schema illustrates all the steps that the generalization algorithm does. In order to be more comprehensive, we will use and example. |
- | Given the follow schema about music compilation: | + | Suppose that: |
<code sql> | <code sql> | ||
- | Compilation(idComp,name,year,label,year) | + | Q = A1, A2, A3, A4 |
- | TrackContained(idComp,idTrack) | + | |
- | Track(idTrack,title,timing,idAuthor) | + | |
- | Author(idAuthor,surname,name,birthdate) | + | |
</code> | </code> | ||
- | + | where A1, A2, A3 and A4 are atoms and we have these TC-statements | |
- | Suppose we have this query: | + | |
<code sql> | <code sql> | ||
- | SELECT Author.name | + | TC1 - A1(a) = A1(i) |
- | FROM Compilation,TrackContained,Track,Author | + | TC2 - A2(a) = A2(i), A3(i) |
- | WHERE Compilation.idComp = TrackContained.idComp | + | TC3 - A3(a) = A3(i), A4(i) |
- | AND TrackContained.idTrack = Track.idTrack | + | |
- | AND Track.idAuthor = Author.idAuthor | + | |
- | AND Compilation.name = "I love rock and roll" | + | |
</code> | </code> | ||
+ | Running MAGIK on this query and given database schema and TC-statements, we get that this query is <fc red>NOT COMPLETE</fc> | ||
- | and we have this table completeness statements: | + | At this point the generalization algorithm enter in action.\\ |
+ | The idea is at each iteration remove all the incomplete atoms. The algorithm will stop if there is no other atoms that are incomplete (we said that it reaches a **fixed point**). | ||
- | <code sql> | + | ** __1st iteration__ ** |
- | TC1 = TrackContained(idComp,idTrack) WHERE Track(idTrack,title,timing,idAuthor) | + | |
- | TC2 = Track(idTrack,title,timing,idAuthor) WHERE Author(idAuthor,surname,name,birthdate) | + | Looking the 3 tc-statements that we have, we see that there is no information about A4 (you have to look on the left side of the equal) so, the algorithm will remove from the query the atom A4. \\ |
- | TC3 = Author(idAuthor,surname,name,birthdate) | + | New query is |
+ | <code> | ||
+ | Q1 = A1, A2, A3 | ||
</code> | </code> | ||
+ | Obviously Q1 is not equal to Q, so the algorithm will continue. | ||
- | Running MAGIK on this query and given database schema and TC-statements, we get that this query is <fc red>NOT COMPLETE</fc> | + | ** __2nd iteration__ ** |
- | At this point the generalization algorithm enter in action.// | + | Now we know that A4 is incomplete. From this we can inferred that also A3 is not complete because A3 in complete iff also A4 is complete (TC3), but this is not the case. So A3 will be removed from the query.\\ |
- | The idea is at each iteration remove all the incomplete atoms. The algorithm will stop if there is no other atoms that are incomplete (we said that it reaches a **fixed point**). | + | New query is |
+ | <code> | ||
+ | Q2 = A1, A2 | ||
+ | </code> | ||
+ | Obviously Q2 is not equal to Q1, so the algorithm will continue. | ||
+ | |||
+ | ** __3rd iteration__ ** | ||
+ | |||
+ | Now we know that A4 and A3 are incomplete. From this we can inferred that also A2 is not complete because A2 is complete iff also A3 is complete (TC2), but this is not the case. For this reason A2 will be removed from the query.\\ | ||
+ | New query is | ||
+ | <code> | ||
+ | Q3 = A1 | ||
+ | </code> | ||
+ | Obviously Q3 is not equal to Q2, so the algorithm will start another iteration | ||
+ | |||
+ | ** __4th iteration__ ** | ||
+ | |||
+ | Now we have only A1. This atom is complete because there is no other atoms involved in its completeness (TC1). So A1 survived.\\ | ||
+ | New query is | ||
+ | <code> | ||
+ | Q4 = A1 | ||
+ | </code> | ||
+ | Now Q4 is **equal** to Q3, so we reach a **fix point** and the algorithm stops. | ||
+ | |||
+ | So the generalizated query Qgen is: | ||
+ | <code> | ||
+ | Qgen = A1 | ||
+ | </code> |