funnelweb ploneconf2010
DESCRIPTION
PloneConf 2010 talk about funnelweb, an easy content conversion framework that makes importing any site easy.
TRANSCRIPT
[email protected] Conf 2010 Dylan Jay
Content Conversions suck
Large existing sites
Static HTML or old CMS
Hard to quote on
Content audit
Use Plone to fix content
Convert Docs to Pages (coming...)
History
2008 - Obrien Intranet
2009 - pretaweb.funnelweb (deprecated)
  Plone UI > Actions > Import
2010 - transmogrify.* release on PyPI
2010 - collective.developermanual
  sphinx to plone
2010 - funnelweb Recipe + Script
Thanks - Dylan Jay, Vitaliy Podoba, Rok Garbas, Mikko Ohtamaa, Tim Knap
funnelweb.recipe
Add to buildout
[funnelweb]
recipe = funnelweb
crawler-url=http://www.whitehouse.gov
bin/funnelweb
Crawls
Caches locally
Filters
Removes template
Restructures
Determines title, hidden etc.
Uploads to Plone
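The steps above run as a chain in which each stage consumes the items produced by the one before it. The following is only a minimal sketch of that idea in plain Python; the function names, the `_path` and `title` keys, and the sample pages are illustrative assumptions and not part of funnelweb.

```python
# Illustrative stand-ins for a few of the steps on the slide.
# None of these functions are part of funnelweb itself.

def crawl(pages):
    """Yield one item dict per 'crawled' page."""
    for path, html in pages.items():
        yield {"_path": path, "text": html}


def drop_ignored(items, ignored_suffixes=(".css", ".js")):
    """Filter out items we do not want to import."""
    for item in items:
        if not item["_path"].endswith(ignored_suffixes):
            yield item


def guess_title(items):
    """Derive a title from the last path segment."""
    for item in items:
        name = item["_path"].rstrip("/").split("/")[-1]
        item["title"] = name.rsplit(".", 1)[0].replace("-", " ").title()
        yield item


def run_pipeline(pages):
    """Chain the stages; each one pulls items lazily from the previous."""
    return list(guess_title(drop_ignored(crawl(pages))))


SAMPLE_PAGES = {
    "about-us.html": "<p>About</p>",
    "style.css": "body {}",
    "news/latest-news.html": "<p>News</p>",
}
```

Calling `run_pipeline(SAMPLE_PAGES)` returns the two HTML pages with guessed titles and skips the stylesheet.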
Common Options
crawler:site_url
crawler:ignore
ploneupload:target
template1:description
template1:text
*-disable
Command Line
bin/funnelweb --crawler:max=50 --localupload:output=var/funnelwebdebug
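Each command-line flag names a pipeline section and one of its options in the form `--section:key=value`. As a rough illustration of how such flags could be turned into per-section overrides, here is a small sketch; `parse_overrides` is a hypothetical helper and does not reflect how funnelweb actually parses its arguments.

```python
def parse_overrides(argv):
    """Turn ['--crawler:max=50'] into {'crawler': {'max': '50'}}.

    Hypothetical helper: anything that is not of the form
    --section:key=value is ignored in this sketch.
    """
    overrides = {}
    for arg in argv:
        if not arg.startswith("--") or "=" not in arg:
            continue
        target, value = arg[2:].split("=", 1)
        if ":" not in target:
            continue
        section, key = target.split(":", 1)
        overrides.setdefault(section, {})[key] = value
    return overrides


ARGS = ["--crawler:max=50", "--localupload:output=var/funnelwebdebug"]
```

With the flags from the slide, `parse_overrides(ARGS)` groups `max` under `crawler` and `output` under `localupload`.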
Custom pipeline
bin/funnelweb --pipeline > pipeline.cfg
{edit} pipeline.cfg
bin/funnelweb --pipeline=pipeline.cfg
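A transmogrifier pipeline file is an INI-style config: a `[transmogrifier]` section lists the section names in order, and each named section states its `blueprint`. The sketch below reads such a file with Python's standard `configparser` to show which blueprints would run in which order. The excerpt is a made-up, shortened example and not the pipeline funnelweb actually generates; `list_sections` is a hypothetical helper.

```python
import configparser

# Hypothetical, shortened pipeline.cfg; the real generated file is longer.
PIPELINE_CFG = """
[transmogrifier]
pipeline =
    crawler
    typerecognitor

[crawler]
blueprint = transmogrify.webcrawler
site_url = http://www.whitehouse.gov

[typerecognitor]
blueprint = transmogrify.webcrawler.typerecognitor
"""


def list_sections(cfg_text):
    """Return (section name, blueprint) pairs in pipeline order."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    names = parser.get("transmogrifier", "pipeline").split()
    return [(name, parser.get(name, "blueprint")) for name in names]
```

Reordering or removing a name in the `pipeline` list is what changes the order of processing, which is the kind of edit the `{edit}` step refers to.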
Making your own blueprint
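from zope.interface import classProvides, implements
from collective.transmogrifier.interfaces import ISection, ISectionBlueprint

class MyBlueprint(object):
    classProvides(ISectionBlueprint)
    implements(ISection)

    def __init__(self, transmogrifier, name, options, previous):
        # previous is the upstream section; iterating it pulls items through
        self.previous = previous

    def __iter__(self):
        for item in self.previous:
            dosomethingto(item)  # placeholder for your own transformation
            yield item

<utility component=".myblueprint.MyBlueprint"
         name="transmogrify.myblueprint" />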
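To see how sections with this constructor signature link together through `previous`, the following sketch chains two of them by hand, without Zope or collective.transmogrifier installed. `SetDefault` and `chain` are hypothetical names, and passing `None` as the transmogrifier is a simplification; the real framework supplies its own object and finds blueprints through the registered utility.

```python
class SetDefault(object):
    """Toy section: fill in a missing key, configured through `options`."""

    def __init__(self, transmogrifier, name, options, previous):
        self.previous = previous
        self.key = options["key"]
        self.value = options["value"]

    def __iter__(self):
        for item in self.previous:
            item.setdefault(self.key, self.value)
            yield item


def chain(source, specs):
    """Build sections in order, handing each one the section before it."""
    previous = iter(source)
    for name, factory, options in specs:
        # None stands in for the transmogrifier object in this sketch.
        previous = factory(None, name, options, previous)
    return previous


ITEMS = [{"_path": "a"}, {"_path": "b", "_type": "Image"}]
SPECS = [
    ("type", SetDefault, {"key": "_type", "value": "Document"}),
    ("state", SetDefault, {"key": "state", "value": "published"}),
]
```

Iterating `chain(ITEMS, SPECS)` pulls every item through both sections in turn, so nothing runs until the last section is consumed.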
transmogrify.webcrawler
transmogrify.webcrawler Crawls site or cache for content
transmogrify.webcrawler.typerecognitor Sets Plone content type based on mime-type
transmogrify.webcrawler.cache Saves content to disk
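Setting a content type from a mime-type amounts to a small lookup. This is a sketch of the idea only: the mapping and the `_mimetype` and `_type` key names are assumptions made for illustration, not the actual rules of transmogrify.webcrawler.typerecognitor.

```python
def recognise_type(mimetype):
    """Map a mime-type to a Plone content type (assumed mapping)."""
    if mimetype in ("text/html", "application/xhtml+xml"):
        return "Document"
    if mimetype.startswith("image/"):
        return "Image"
    return "File"


def add_types(items):
    """Annotate each item with a content type based on its mime-type."""
    for item in items:
        item["_type"] = recognise_type(item.get("_mimetype", ""))
        yield item
```

Anything that is neither HTML nor an image falls back to `File` in this sketch, so PDFs and office documents would be uploaded as downloadable files.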
transmogrify.htmlcontentextractor
transmogrify.htmlcontentextractor Provide XPath for title, description, text etc.
transmogrify.htmlcontentextractor.auto Guesses XPaths from content
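Giving one XPath per field is what separates the real content from the surrounding site template. The sketch below illustrates this with the limited XPath support in Python's standard `xml.etree.ElementTree` on a small well-formed page. The page, the rules and the `extract` helper are made up for the example; the actual blueprint has its own implementation and copes with real-world HTML.

```python
import xml.etree.ElementTree as ET

# Made-up page: the nav div plays the role of the site template.
PAGE = """<html><head><title>Old site</title></head>
<body><div id="nav">Home</div>
<div id="content"><h1>About us</h1><p>We moved.</p></div></body></html>"""

# One XPath per field, similar in spirit to template1:text on the options slide.
RULES = {
    "title": ".//div[@id='content']/h1",
    "text": ".//div[@id='content']/p",
}


def extract(html, rules):
    """Return {field: text} for every rule whose XPath matches."""
    root = ET.fromstring(html)
    fields = {}
    for field, xpath in rules.items():
        node = root.find(xpath)
        if node is not None:
            fields[field] = "".join(node.itertext()).strip()
    return fields
```

Because only the matched nodes are kept, the navigation in `div#nav` never reaches the extracted fields.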
transmogrify.siteanalyser
transmogrify.siteanalyser.relinker Moves and renames items, tidies URLs
transmogrify.siteanalyser.title Guesses page titles
transmogrify.siteanalyser.defaultpage Moves index pages into folders
transmogrify.siteanalyser.attach Moves attachments closer to pages
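As an illustration of the default page idea, the sketch below emits a folder item for every index page it sees and records that page as the folder's default view. The list of index names, the `_defaultpage` key and the emitted folder item are assumptions for the example, not the behaviour of transmogrify.siteanalyser.defaultpage itself.

```python
# Assumed set of file names that count as index pages.
INDEX_NAMES = ("index.html", "index.htm", "default.htm")


def add_default_pages(items):
    """Emit a folder item before each index page found in a subfolder."""
    for item in items:
        folder, sep, name = item["_path"].rpartition("/")
        if sep and name in INDEX_NAMES:
            yield {"_path": folder, "_type": "Folder", "_defaultpage": name}
        yield item


PAGES = [{"_path": "about/index.html"}, {"_path": "contact.html"}]
```

Here `about/index.html` produces an extra `about` folder item, while `contact.html` passes through unchanged.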
transmogrify.ploneremote
Remoteconstructor Adds content to Plone via XML-RPC
Remoteschemaupdater Updates content of existing objects
Remotenavigationexcluder Hides content not in the original site's navigation
Remoteworkflowupdater Publishes content
Remoteredirector Creates aliases for items that have moved
Other blueprints
transmogrify.pathsorter Puts folders before content and content in right order
collective.transmogrifier.sections.condition Useful to drop certain content
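Both ideas are small enough to sketch in plain Python. Sorting on the list of path segments guarantees that a folder comes before anything inside it, and a condition passes on only the items for which it holds. `sort_by_path` and `keep_if` are hypothetical helpers; the real sections are configured in the pipeline file rather than called like this, and this sketch does not try to restore the original ordering of content within a folder.

```python
def sort_by_path(items):
    """Order items so every folder precedes the items inside it."""
    return sorted(items, key=lambda item: item["_path"].strip("/").split("/"))


def keep_if(items, condition):
    """Pass on only the items for which condition(item) is true."""
    for item in items:
        if condition(item):
            yield item


ITEMS = [
    {"_path": "news/2010/a.html", "_type": "Document"},
    {"_path": "news", "_type": "Folder"},
    {"_path": "logo.png", "_type": "Image"},
    {"_path": "news/2010", "_type": "Folder"},
]
```

For example, `keep_if(sort_by_path(ITEMS), lambda item: item["_type"] != "Image")` drops the image and yields `news`, then `news/2010`, then the page inside it.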
Where to get it
http://github.com:djay/funnelweb.git
http://github.com:djay/transmogrify.*
PyPI release TBA