TRIAD - Supplier Part Catalogue - Mapping of Data Resource on Data Model

Create a Triad Framework object

In [1]:
from TRIAD.CacheTriad import TriadFramework

triad=TriadFramework('_SYSTEM', 'SYS', 1)
NetworkX Hypergraph Interface : 
Enhanced drawing and processing routines for NetworkX graph package
(C) 2017 HEALIS.EU - Athanassios I. Hatzis, All Rights Reserved

CachePython Inteface : 
Enhanced OOP Cache-Python interface based on Intersystems Cache python binding modules
(C) 2017 HEALIS.EU - Athanassios I. Hatzis, All Rights Reserved

CacheHypergraph Interface : 
Python Binding of Intersystems Cache Associative Semiotic Hypegraph engine
(C) 2017 HEALIS.EU - Athanassios I. Hatzis, All Rights Reserved

CacheTriad Interface : 
A Python client for Triad Associative Semiotic Hypergraph framework based on previous modules
(C) 2017 HEALIS.EU - Athanassios I. Hatzis, All Rights Reserved

Get Data Resources

In [2]:
resdf = triad.get(what='Data Resources', con=0, itm=0)
resdf
Out[2]:
KEY CNT NAM ABBR MKEY MNAM MOFMKEY MOFMNAM TYPE DICTYPE PATH
0 33||0||0||0 1 DATA RESOURCES ENVIRONMENT DRE None None None None DICT ENV None
1 33||1024||0||0 0 SUPPLIER PART CATALOGUE Data Resource SPC_DR None None None None DICT SYS /home/athan/Work/TRIADemo/Data/SupplierPartCat...

Map Columns on Attributes

Notice: The data type of each column must match the data type of its mapped attribute. Those data types that are currently supported in TRIADB FRAMEWORK are:

| Text,        | TXT  |(Short passages of text in any language)
| Word,        | WRD  |(Words or small phrases up to 128 characters long)
| Real,        | REA  |(Number)
| Integer,     | INT  |(Number)
| TimeStamp,   | TS   |(Date and Time)
| Code,        | COD  |(Any non-readable string up to 128 characters long)
| URL, PATH,   | LNK  |(URL address, or PATH up to 2048 characters long)
| Binary,      | RAW  |(Binary sequence up '1048487' bytes in size)
| Categorical, | ENUM |(Enumerated string up to 128 characters long)

Left Side - Columns

In [3]:
ldf = triad.get(what='Data Resources', sys=1024, con=20)
ldf.iloc[1:]
Out[3]:
KEY CNT NAM ABBR MKEY MNAM MOFMKEY MOFMNAM TYPE DICTYPE PATH
1 33||1024||20||1 0 catsid_col catsid_col None None None None DICT TYPE None
2 33||1024||20||2 0 catpid_col catpid_col None None None None DICT TYPE None
3 33||1024||20||3 0 catcost_col catcost_col None None None None DICT TYPE None
4 33||1024||20||4 0 catqnt_col catqnt_col None None None None DICT TYPE None
5 33||1024||20||5 0 catdate_col catdate_col None None None None DICT TYPE None
6 33||1024||20||6 0 catchk_col catchk_col None None None None DICT TYPE None
7 33||1024||20||7 0 pid_col pid_col None None None None DICT TYPE None
8 33||1024||20||8 0 pname_col pname_col None None None None DICT TYPE None
9 33||1024||20||9 0 pcolor_col pcolor_col None None None None DICT TYPE None
10 33||1024||20||10 0 pweight_col pweight_col None None None None DICT TYPE None
11 33||1024||20||11 0 punit_col punit_col None None None None DICT TYPE None
12 33||1024||20||12 0 sid_col sid_col None None None None DICT TYPE None
13 33||1024||20||13 0 sname_col sname_col None None None None DICT TYPE None
14 33||1024||20||14 0 saddress_col saddress_col None None None None DICT TYPE None
15 33||1024||20||15 0 scity_col scity_col None None None None DICT TYPE None
16 33||1024||20||16 0 scountry_col scountry_col None None None None DICT TYPE None
17 33||1024||20||17 0 sstatus_col sstatus_col None None None None DICT TYPE None

Right Side - Attributes

In [4]:
rdf = triad.get(what='DME', sys=1111, con=20)
rdf.iloc[1:]
Out[4]:
KEY CNT NAM ABBR MKEY MNAM MOFMKEY MOFMNAM TYPE DICTYPE PATH
1 44||1111||20||1 0 supID id 55||1032||40||0 INT None None DICT TYPE None
2 44||1111||20||2 0 supName nameEN 55||1032||50||0 WRD None None DICT TYPE None
3 44||1111||20||3 0 supAddress address 55||1032||60||0 TXT None None DICT TYPE None
4 44||1111||20||4 0 supCity city 55||1032||70||0 WRD None None DICT TYPE None
5 44||1111||20||5 0 supCountry country 55||1032||80||0 WRD None None DICT TYPE None
6 44||1111||20||6 0 supStatus status 55||1032||90||0 INT None None DICT TYPE None
7 44||1111||20||7 0 prtID id 55||1032||100||0 INT None None DICT TYPE None
8 44||1111||20||8 0 prtName nameEN 55||1032||110||0 WRD None None DICT TYPE None
9 44||1111||20||9 0 prtColor color 55||1032||120||0 WRD None None DICT TYPE None
10 44||1111||20||10 0 prtWeight weight 55||1032||130||0 REA None None DICT TYPE None
11 44||1111||20||11 0 prtUnit unit 55||1032||140||0 ENUM None None DICT TYPE None
12 44||1111||20||12 0 catPrice price 55||1032||150||0 REA None None DICT TYPE None
13 44||1111||20||13 0 catTotal total 55||1032||160||0 INT None None DICT TYPE None
14 44||1111||20||14 0 catDate date 55||1032||170||0 TS None None DICT TYPE None
15 44||1111||20||15 0 catCheck check 55||1032||180||0 ENUM None None DICT TYPE None

Draw Bipartite Sets

In [5]:
graph=triad.set_mapping(left_df=ldf.iloc[1:], right_df=rdf.iloc[1:], left_heading='Columns', right_heading='Attributes')
graph.set_dpi(65) ; graph.draw()
Out[5]:
G cluster_Left Columns cluster_Right Attributes 33||1024||20||1 catsid_col 33||1024||20||2 catpid_col 33||1024||20||1->33||1024||20||2 33||1024||20||3 catcost_col 33||1024||20||2->33||1024||20||3 33||1024||20||4 catqnt_col 33||1024||20||3->33||1024||20||4 33||1024||20||5 catdate_col 33||1024||20||4->33||1024||20||5 33||1024||20||6 catchk_col 33||1024||20||5->33||1024||20||6 33||1024||20||7 pid_col 33||1024||20||6->33||1024||20||7 33||1024||20||8 pname_col 33||1024||20||7->33||1024||20||8 33||1024||20||9 pcolor_col 33||1024||20||8->33||1024||20||9 33||1024||20||10 pweight_col 33||1024||20||9->33||1024||20||10 33||1024||20||11 punit_col 33||1024||20||10->33||1024||20||11 33||1024||20||12 sid_col 33||1024||20||11->33||1024||20||12 33||1024||20||13 sname_col 33||1024||20||12->33||1024||20||13 33||1024||20||14 saddress_col 33||1024||20||13->33||1024||20||14 33||1024||20||15 scity_col 33||1024||20||14->33||1024||20||15 33||1024||20||16 scountry_col 33||1024||20||15->33||1024||20||16 33||1024||20||17 sstatus_col 33||1024||20||16->33||1024||20||17 44||1111||20||1 supID 44||1111||20||2 supName 44||1111||20||1->44||1111||20||2 44||1111||20||3 supAddress 44||1111||20||2->44||1111||20||3 44||1111||20||4 supCity 44||1111||20||3->44||1111||20||4 44||1111||20||5 supCountry 44||1111||20||4->44||1111||20||5 44||1111||20||6 supStatus 44||1111||20||5->44||1111||20||6 44||1111||20||7 prtID 44||1111||20||6->44||1111||20||7 44||1111||20||8 prtName 44||1111||20||7->44||1111||20||8 44||1111||20||9 prtColor 44||1111||20||8->44||1111||20||9 44||1111||20||10 prtWeight 44||1111||20||9->44||1111||20||10 44||1111||20||11 prtUnit 44||1111||20||10->44||1111||20||11 44||1111||20||12 catPrice 44||1111||20||11->44||1111||20||12 44||1111||20||13 catTotal 44||1111||20||12->44||1111||20||13 44||1111||20||14 catDate 44||1111||20||13->44||1111||20||14 44||1111||20||15 catCheck 44||1111||20||14->44||1111||20||15

Add Mapping

In [6]:
matching_pairs = [
 ('catsid_col', 'supID'),
 ('catpid_col', 'prtID'),
 ('catcost_col', 'catPrice'),
 ('catqnt_col', 'catTotal'),
 ('catdate_col', 'catDate'),
 ('catchk_col', 'catCheck'),
 ('pid_col', 'prtID'),
 ('pname_col', 'prtName'),
 ('pcolor_col', 'prtColor'),
 ('pweight_col', 'prtWeight'),
 ('punit_col', 'prtUnit'),
 ('sid_col', 'supID'),
 ('sname_col', 'supName'),
 ('saddress_col', 'supAddress'),
 ('scity_col', 'supCity'),
 ('scountry_col', 'supCountry'),
 ('sstatus_col', 'supStatus') ]
In [7]:
triad.add(what='Mapping Data Resource', label_pairs=matching_pairs, pydot_graph=graph)
Out[7]:
[('33||1024||20||1', '44||1111||20||1'),
 ('33||1024||20||2', '44||1111||20||7'),
 ('33||1024||20||3', '44||1111||20||12'),
 ('33||1024||20||4', '44||1111||20||13'),
 ('33||1024||20||5', '44||1111||20||14'),
 ('33||1024||20||6', '44||1111||20||15'),
 ('33||1024||20||7', '44||1111||20||7'),
 ('33||1024||20||8', '44||1111||20||8'),
 ('33||1024||20||9', '44||1111||20||9'),
 ('33||1024||20||10', '44||1111||20||10'),
 ('33||1024||20||11', '44||1111||20||11'),
 ('33||1024||20||12', '44||1111||20||1'),
 ('33||1024||20||13', '44||1111||20||2'),
 ('33||1024||20||14', '44||1111||20||3'),
 ('33||1024||20||15', '44||1111||20||4'),
 ('33||1024||20||16', '44||1111||20||5'),
 ('33||1024||20||17', '44||1111||20||6')]

Draw Again

In [8]:
graph.draw()
Out[8]:
G cluster_Left Columns cluster_Right Attributes 33||1024||20||1 catsid_col 33||1024||20||2 catpid_col 33||1024||20||1->33||1024||20||2 44||1111||20||1 supID 33||1024||20||1->44||1111||20||1 33||1024||20||3 catcost_col 33||1024||20||2->33||1024||20||3 44||1111||20||7 prtID 33||1024||20||2->44||1111||20||7 33||1024||20||4 catqnt_col 33||1024||20||3->33||1024||20||4 44||1111||20||12 catPrice 33||1024||20||3->44||1111||20||12 33||1024||20||5 catdate_col 33||1024||20||4->33||1024||20||5 44||1111||20||13 catTotal 33||1024||20||4->44||1111||20||13 33||1024||20||6 catchk_col 33||1024||20||5->33||1024||20||6 44||1111||20||14 catDate 33||1024||20||5->44||1111||20||14 33||1024||20||7 pid_col 33||1024||20||6->33||1024||20||7 44||1111||20||15 catCheck 33||1024||20||6->44||1111||20||15 33||1024||20||8 pname_col 33||1024||20||7->33||1024||20||8 33||1024||20||7->44||1111||20||7 33||1024||20||9 pcolor_col 33||1024||20||8->33||1024||20||9 44||1111||20||8 prtName 33||1024||20||8->44||1111||20||8 33||1024||20||10 pweight_col 33||1024||20||9->33||1024||20||10 44||1111||20||9 prtColor 33||1024||20||9->44||1111||20||9 33||1024||20||11 punit_col 33||1024||20||10->33||1024||20||11 44||1111||20||10 prtWeight 33||1024||20||10->44||1111||20||10 33||1024||20||12 sid_col 33||1024||20||11->33||1024||20||12 44||1111||20||11 prtUnit 33||1024||20||11->44||1111||20||11 33||1024||20||13 sname_col 33||1024||20||12->33||1024||20||13 33||1024||20||12->44||1111||20||1 33||1024||20||14 saddress_col 33||1024||20||13->33||1024||20||14 44||1111||20||2 supName 33||1024||20||13->44||1111||20||2 33||1024||20||15 scity_col 33||1024||20||14->33||1024||20||15 44||1111||20||3 supAddress 33||1024||20||14->44||1111||20||3 33||1024||20||16 scountry_col 33||1024||20||15->33||1024||20||16 44||1111||20||4 supCity 33||1024||20||15->44||1111||20||4 33||1024||20||17 sstatus_col 33||1024||20||16->33||1024||20||17 44||1111||20||5 supCountry 33||1024||20||16->44||1111||20||5 44||1111||20||6 supStatus 33||1024||20||17->44||1111||20||6 44||1111||20||1->44||1111||20||2 44||1111||20||2->44||1111||20||3 44||1111||20||3->44||1111||20||4 44||1111||20||4->44||1111||20||5 44||1111||20||5->44||1111||20||6 44||1111||20||6->44||1111||20||7 44||1111||20||7->44||1111||20||8 44||1111||20||8->44||1111||20||9 44||1111||20||9->44||1111||20||10 44||1111||20||10->44||1111||20||11 44||1111||20||11->44||1111||20||12 44||1111||20||12->44||1111||20||13 44||1111||20||13->44||1111||20||14 44||1111||20||14->44||1111||20||15

Map Tables on Entities

Left Side - Tables

In [9]:
ldf = triad.get(what='Data Resources', sys=1024, con=10)

Right Side - Entities

In [10]:
rdf = triad.get(what='DME', sys=1111, con=10)

Map Tables on Entities

In [11]:
graph=triad.set_mapping(left_df=ldf.iloc[1:],  right_df=rdf.iloc[1:], left_heading='Tables', right_heading='Entities')
graph.draw()
Out[11]:
G cluster_Left Tables cluster_Right Entities 33||1024||10||1 Catalogue.tsv 33||1024||10||2 Parts.tsv 33||1024||10||1->33||1024||10||2 33||1024||10||3 Suppliers.tsv 33||1024||10||2->33||1024||10||3 44||1111||10||1 Supplier 44||1111||10||2 Part 44||1111||10||1->44||1111||10||2 44||1111||10||3 Catalog 44||1111||10||2->44||1111||10||3

Add Mapping

In [12]:
matching_pairs = [('Catalogue.tsv', 'Catalog'), ('Suppliers.tsv', 'Supplier'), ('Parts.tsv', 'Part')]
In [13]:
triad.add(what='Mapping Data Resource', label_pairs=matching_pairs, pydot_graph=graph)
Out[13]:
[('33||1024||10||1', '44||1111||10||3'),
 ('33||1024||10||3', '44||1111||10||1'),
 ('33||1024||10||2', '44||1111||10||2')]
In [14]:
graph.draw()
Out[14]:
G cluster_Left Tables cluster_Right Entities 33||1024||10||1 Catalogue.tsv 33||1024||10||2 Parts.tsv 33||1024||10||1->33||1024||10||2 44||1111||10||3 Catalog 33||1024||10||1->44||1111||10||3 33||1024||10||3 Suppliers.tsv 33||1024||10||2->33||1024||10||3 44||1111||10||2 Part 33||1024||10||2->44||1111||10||2 44||1111||10||1 Supplier 33||1024||10||3->44||1111||10||1 44||1111||10||1->44||1111||10||2 44||1111||10||2->44||1111||10||3

View Data Resource with Mapping

In [15]:
triad.get(what='Data Resources', sys=1024)
Out[15]:
KEY CNT NAM ABBR MKEY MNAM MOFMKEY MOFMNAM TYPE DICTYPE PATH
0 33||1024||0||0 0 SUPPLIER PART CATALOGUE Data Resource SPC_DR None None None None DICT SYS /home/athan/Work/TRIADemo/Data/SupplierPartCat...
1 33||1024||10||0 3 TSV FLAT FILE TBL None None None None DICT STYPE None
2 33||1024||10||1 0 Catalogue.tsv Catalogue 44||1111||10||3 Catalog 55||1032||30||0 INT DICT TYPE None
3 33||1024||10||2 0 Parts.tsv Parts 44||1111||10||2 Part 55||1032||20||0 INT DICT TYPE None
4 33||1024||10||3 0 Suppliers.tsv Suppliers 44||1111||10||1 Supplier 55||1032||10||0 INT DICT TYPE None
5 33||1024||20||0 17 FIELD COL None None None None DICT STYPE None
6 33||1024||20||1 0 catsid_col catsid_col 44||1111||20||1 supID 55||1032||40||0 INT DICT TYPE None
7 33||1024||20||2 0 catpid_col catpid_col 44||1111||20||7 prtID 55||1032||100||0 INT DICT TYPE None
8 33||1024||20||3 0 catcost_col catcost_col 44||1111||20||12 catPrice 55||1032||150||0 REA DICT TYPE None
9 33||1024||20||4 0 catqnt_col catqnt_col 44||1111||20||13 catTotal 55||1032||160||0 INT DICT TYPE None
10 33||1024||20||5 0 catdate_col catdate_col 44||1111||20||14 catDate 55||1032||170||0 TS DICT TYPE None
11 33||1024||20||6 0 catchk_col catchk_col 44||1111||20||15 catCheck 55||1032||180||0 ENUM DICT TYPE None
12 33||1024||20||7 0 pid_col pid_col 44||1111||20||7 prtID 55||1032||100||0 INT DICT TYPE None
13 33||1024||20||8 0 pname_col pname_col 44||1111||20||8 prtName 55||1032||110||0 WRD DICT TYPE None
14 33||1024||20||9 0 pcolor_col pcolor_col 44||1111||20||9 prtColor 55||1032||120||0 WRD DICT TYPE None
15 33||1024||20||10 0 pweight_col pweight_col 44||1111||20||10 prtWeight 55||1032||130||0 REA DICT TYPE None
16 33||1024||20||11 0 punit_col punit_col 44||1111||20||11 prtUnit 55||1032||140||0 ENUM DICT TYPE None
17 33||1024||20||12 0 sid_col sid_col 44||1111||20||1 supID 55||1032||40||0 INT DICT TYPE None
18 33||1024||20||13 0 sname_col sname_col 44||1111||20||2 supName 55||1032||50||0 WRD DICT TYPE None
19 33||1024||20||14 0 saddress_col saddress_col 44||1111||20||3 supAddress 55||1032||60||0 TXT DICT TYPE None
20 33||1024||20||15 0 scity_col scity_col 44||1111||20||4 supCity 55||1032||70||0 WRD DICT TYPE None
21 33||1024||20||16 0 scountry_col scountry_col 44||1111||20||5 supCountry 55||1032||80||0 WRD DICT TYPE None
22 33||1024||20||17 0 sstatus_col sstatus_col 44||1111||20||6 supStatus 55||1032||90||0 INT DICT TYPE None