A General Architecture for Finding Structural Regularities on the Web
[Figure: OEM graph of restaurant data. Object identifiers on nodes: &12, &13, &14, &15, &16, &17, &19, &23, &25, &35, &44, &54, &55, &66, &77, &79, &80. Edge labels: restaurant, name, address, street, city, zipcode, category, price, nearby. Atomic values: "Chef Chu", "El Camino Real", "Palo Alto", "92310", "gourmet", "Saigon", "Mountain View", "Menlo Park", "Cheap", "Fast food", "McDonald's", "Vietnamese".]
[Figure: OEM graph of person data. Edge labels: person, identity, name, firstname, address, street, zipcode, company, Director, id.]
[Figure: system architecture. Inputs: Source 1, Source 2, ..., Source n. Large File: T1 person..., T2 person{name .., T3 person{ident .., T4 person{id : < ... Components: Associated OEM Graph, Mapping, Data Mining, repository, dictionary, mapping values. Output: Frequent Paths (RESULT).]
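The slides name an OEM graph and a step that yields frequent paths, but the transcript does not preserve how either is built. As a rough illustration only, the sketch below stores a tiny OEM-style graph as a dictionary and counts how often each label path occurs from a root. The graph wiring, the function names `label_paths` and `frequent_paths`, and the `min_support` and `max_len` parameters are all assumptions made for this example, not the authors' method.

```python
from collections import Counter

# Hypothetical OEM-style graph: object id -> list of (edge label, target).
# A target is either another object id (a key of this dict) or an atomic value.
# Ids and values echo the slide, but the wiring between them is assumed.
oem = {
    "&12": [("restaurant", "&14"), ("restaurant", "&44")],
    "&14": [("name", "Chef Chu"), ("address", "&35")],
    "&35": [("street", "El Camino Real"), ("city", "Palo Alto"),
            ("zipcode", "92310")],
    "&44": [("name", "Saigon"), ("category", "gourmet"), ("address", "&19")],
    "&19": [("city", "Mountain View")],
}


def label_paths(graph, root, max_len=4):
    """Yield every label path (tuple of edge labels) that starts at root.

    max_len bounds the path length, which also keeps the walk finite
    if the graph contains cycles (e.g. 'nearby' edges).
    """
    stack = [(root, ())]
    while stack:
        node, path = stack.pop()
        if len(path) == max_len:
            continue
        for label, target in graph.get(node, []):
            new_path = path + (label,)
            yield new_path
            if target in graph:  # complex object: keep descending
                stack.append((target, new_path))


def frequent_paths(graph, root, min_support=2, max_len=4):
    """Return the label paths that occur at least min_support times."""
    counts = Counter(label_paths(graph, root, max_len))
    return {p: c for p, c in counts.items() if c >= min_support}


if __name__ == "__main__":
    for path, count in sorted(frequent_paths(oem, "&12").items()):
        print(".".join(path), count)
```

With this toy graph, `restaurant.name` and `restaurant.address.city` occur under both restaurants and are kept, while `restaurant.category` occurs once and is dropped. A real system would need a more careful notion of support than raw occurrence counts.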