java.blogs“The movement, the site, the technology”
Prepared by Mike Cannon-Brookes - June, 2003
[email protected] - http://www.atlassian.com
2
Agenda
1. Blogging & Java
2. What is java.blogs?
3. Technology behind javablogs.com
3
What is a weblog?
• “News site” written by one person
• Online, chronological journal
• Personal list of thoughts, links etc
• Newest entries at the top of the page
• Usually has an RSS version
• Hard to categorize, easier to show you an example
4
Posts
Typical Blog Pieces
Silly Title
Ugly Author Pic
5
Posts
Nav
Search
Blog Roll
About
6
Why read blogs?
• Get a personal view point
• Bloggers usually discuss latest content / ideas
• In the Java world, often a good source of solutions and personal experience
– Example: commons-logging & log4j
• Comparison: best feature on Amazon?
– book reviews!
7
Why write a blog?
• Everyone has their own reasons
• A place for personal expression
• Share your thoughts, wisdom
• Feel part of a community
• Rant / rave / vent / discuss
– BileBlog perfect example
8
RSS
• Really Simple Syndication?
• Rich Site Summary?
• RSS is the “protocol” of the syndication world
• An XML format representing a list of links
• Used to syndicate weblogs in a machine readable format
• It is a very simple format…
9
RSS Example
<rss>
  <channel>
    <title>rebelutionary</title>
    <link>http://blogs.atlassian.com/rebelutionary/</link>
    <description>Mike Cannon-Brookes on J2EE, Java, software development, bug tracking, Atlassian, JIRA and whatever comes to mind.</description>
    <item>
      <title>The BileBlog is painfully accurate</title>
      <link>http://blogs.atlassian.com/rebelutionary/archives/000166.html</link>
      <description>Let me start by saying …</description>
    </item>
    <item>
      <title>Two Weeks Left to TSS</title>
      <link>http://blogs.atlassian.com/rebelutionary/archives/000165.html</link>
      <description>Less than two weeks left until TSS Symposium...</description>
    </item>
    . . .
  </channel>
</rss>
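Reading a feed like the one above can be sketched with the XML parser that ships with the JDK. This is only an illustration: the class `RssItems` and its method are hypothetical names, not java.blogs code, and a real aggregator would also have to cope with malformed feeds.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper (not java.blogs code): lists "title -> link" for every <item>.
class RssItems {
    static List<String> titlesAndLinks(String rssXml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(rssXml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Feed is not well-formed XML", e);
        }
        NodeList items = doc.getElementsByTagName("item");
        List<String> result = new ArrayList<>();
        for (int i = 0; i < items.getLength(); i++) {
            Element item = (Element) items.item(i);
            result.add(text(item, "title") + " -> " + text(item, "link"));
        }
        return result;
    }

    // Trimmed text of the first descendant element with this tag name, or "" if absent.
    private static String text(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        String xml = "<rss><channel><title>rebelutionary</title>"
                + "<item><title>The BileBlog is painfully accurate</title>"
                + "<link>http://blogs.atlassian.com/rebelutionary/archives/000166.html</link></item>"
                + "</channel></rss>";
        titlesAndLinks(xml).forEach(System.out::println);
    }
}
```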
10
Blog Syndication
[Diagram labels: Blog 1, Blog 2, Blog 3; RSS (XML); HTML; News Aggregator; Reader]
11
News Aggregators
• Software to aggregate news from your personal list of favourite sites.
• Popular:
– AmphetaDesk (Perl)
– NetNewsWire (OSX)
– NewsCrawler (Win)
– NewsMonster (Moz)
• java.blogs is a ‘web based’ news aggregator for the Java community.
12
Blogging using Java
• Lots of blog software written in Java
• 3 biggest projects:
– Roller, SnipSnap and Blojsom
• Other projects include:
– PersonalBlog, MiniBlog, CocoBlog etc.
13
Tools: Roller
• www.rollerweblogger.org
• The most advanced Java-based blogging tool
• Supports comments, templating, RSS, XML-RPC, multi-user.
• Deploys on Servlet container & JDBC database
• Technologies used:
– Persistence: Hibernate or Castor
– MVC: Struts
– Views: Velocity
14
Tools: SnipSnap
• www.snipsnap.org
• Innovative Java-based ‘Bliki’ (Blog + Wiki)
• Features include a complex Wiki rendering engine, comments, search, RSS, XML-RPC
• Standalone Jetty instance with McKoi DB
• Technologies:
– Views: JSP
– Everything else hand rolled!
15
Tools: Blojsom
• blojsom.sf.net
• Java port of Blosxom (Perl-based tool)
• Uses the file system for persistence
• Supports comments, RSS, XML-RPC, referrer tracking, searching.
• Runs on JDK 1.4 w/ Servlet 2.3 container.
• Technologies:
– Flexible view dispatcher: JSP, Velocity or Freemarker
16
Tools: Wrap Up
• If you’re looking to start a new blog, without software - try freeroller
http://www.freeroller.net
• No installation required
• Completely web based
• Good way to get blogging!
17
Agenda
• Blogging & Java
• What is java.blogs?
• Technology behind javablogs.com
18
java.blogs Overview
• www.javablogs.com
• A web aggregator for Java focused weblogs
• Currently aggregates:
– Over 1.5 million words
– Written in 18,000+ entries
– Posted by 360+ bloggers
19
20
Information Flows
[Diagram labels: Bloggers; Thoughts; Blogs (Internet); RSS Feeds; java.blogs; Browse Website; Aggregated RSS; Readers]
21
java.blogs Features
• Single ‘feed’ of all Java blog entries
• Decentralized, uncontrolled, focused community
– Each blog owner controls their own blog
• Tracks “popular” entries
– Popular entries show what the community reads
• Text & date searchable content
– eg find all entries about “AOP” this week
• Daily email notifications
– Keep up to date with most popular entries from within your email client
22
java.blogs daily update
23
How is it different?
• It is a true community of equals
• Find a wide range of views on any topic
– eg Over 30 entries discussing the impact of java.net within 24 hours of launch.
• There is no ‘agenda’ - just personal views
• “Good entries” rise to the top
• More news than you can ever want
– Now averaging over 200 entries / day
24
java.blogs vs TSS

java.blogs
• 350 authors
• 1000 users
• Users in total control
– This is good and bad!
• Content is personal
– 100% opinions
• Analogy: having a conversation with 350 developers

TSS
• Few authors
• 290,000 users
• Centralised editorial
– Quality assurance
• Content is controlled
– Opinions in comments
• Analogy: reading a newspaper or magazine
25
java.net?
• java.net contains Sun ‘by-invitation only’ weblogs
• This is good for Javablogs.com!
• Many new blogs to aggregate (java.net has RSS):
– James Gosling,
– Sam Ruby,
– Mike Clark, etc
• See http://weblogs.java.net/ for more information.
26
Agenda
• Blogging & Java
• What is java.blogs?
• Technology behind javablogs.com
27
javablogs.com Components
[Diagram: java.blogs code (Java!) and its components, grouped into Web and Non-Web]
• DB (Postgres)
• Persistence (OFBiz)
• MVC (WebWork)
• Presentation (SiteMesh, WebWork & JSP)
• Scheduling (Quartz & Atlassian Scheduler)
• Mail (JavaMail & Velocity)
• Search & Indexing (Lucene)
• XML / SOAP (Electric XML & Glue)
• User Management (OSUser)
28
Component Tour
• Persistence
• MVC
• Presentation
• Scheduling
• Mail
• Indexing
• Security
29
Component Tour: Scheduling
Components: Quartz & atlassian-scheduler
30
Scheduling: Quartz
• www.part.net/quartz.html
• J2EE job scheduling system
• Alternatives?
– Threads, util.Timer, external cron
• Operates around a job/trigger model
• Built in period and cron triggers
• Extremely flexible, friendly component
• USE: scheduling RSS retrieval, sending daily email updates, any periodic task
• atlassian-scheduler is a Quartz XML config…
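For comparison, the `util.Timer` alternative listed above can be sketched with the JDK alone. The class and method names here are hypothetical, and the sketch only covers a fixed period; it has no cron-style triggers, which is one reason to reach for Quartz.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the java.util.Timer alternative; not Quartz and not java.blogs code.
class PeriodicTimerSketch {
    // Runs the task every periodMillis until it has run `times` times.
    // Returns true if all runs finished within timeoutMillis.
    static boolean runPeriodically(Runnable task, long periodMillis, int times, long timeoutMillis) {
        CountDownLatch remaining = new CountDownLatch(times);
        Timer timer = new Timer("updater", true); // daemon thread, will not block JVM exit
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                if (remaining.getCount() > 0) {
                    task.run();
                    remaining.countDown();
                }
            }
        }, 0, periodMillis);
        try {
            return remaining.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            timer.cancel(); // stop scheduling further runs
        }
    }

    public static void main(String[] args) {
        runPeriodically(() -> System.out.println("fetching RSS feeds..."), 100, 3, 5000);
    }
}
```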
31
Scheduling: Atlassian Scheduler
• Example: scheduler-config.xml
<scheduler>
  <jobs>
    <job name="Updater" class="com...UpdateJob" />
    <job name="DailyEmail" class="com...EmailJob" />
  </jobs>
  <triggers>
    <trigger name="UpdaterTrigger" job="Updater">
      <period>3m</period>
      <startDelay>3m</startDelay>
    </trigger>
    <trigger name="MailTrigger" job="DailyEmail" type="cron">
      <startDelay>1m</startDelay>
      <expression>0 0 * * * ?</expression>
    </trigger>
  </triggers>
</scheduler>
32
Component Tour: Persistence
Component: OFBiz Entity Engine
33
Persistence: OFBiz EE
• www.ofbiz.org - Open For Business project
• “Different” persistence engine
• Light wrapper around JDBC datasources
• Proud to be relational!
• Data entities are generic value objects
• Entity, field, view and relation definitions are in XML
• Use: all data persistence in Javablogs.
34
Persistence: OFBiz EE
• Retrieving all blogs:
    Collection blogs = ee.findAll("Blog");
• Retrieving a specific blog entity:
    Map idField = UtilMisc.toMap("id", id);
    GenericValue blog = ee.findByPrimaryKey("Blog", idField);
• Getting and setting a field value:
    GenericValue blog = retrieveBlog();
    String url = blog.getString("url");
    blog.set("url", newUrl);
    blog.store();
35
Persistence: OFBiz EE
• Retrieving blog entries:
    Collection entries = blog.getRelated("ChildEntry");
• Complex find operation:
    Map fields = UtilMisc.toMap("author", "bob");
    List sortorders = UtilMisc.toList("dateadded");
    entries = ee.findByAnd("Blog", fields, sortorders);
36
Persistence: OFBiz EE
• Automatically creates and updates tables
• Code talks to the logical model
• Logical model + field mappings = physical db
[Diagram labels: javablogs code; OFBiz EE; Logical Model (entitymodel.xml); Field Mappings (fieldtypes-oracle.xml); Physical Database]
37
Persistence: OFBiz EE
• Sample entity from entitymodel.xml
<entity entity-name="BlogEntry" package-name="">
  <field name="id" type="numeric" col-name="entry_id" />
  <field name="blog" type="numeric" />
  <field name="title" type="long-varchar" />
  <field name="created" type="date-time" />
  ...
  <prim-key field="id" />
  <relation type="one" title="Parent" rel-entity-name="Blog">
    <key-map field-name="blog" rel-field-name="id" />
  </relation>
</entity>
38
Persistence: OFBiz EE
• Sample of fieldtypes-postgres.xml
<fieldtypemodel>
  <field-type-def type="date-time" sql-type="TIMESTAMP" java-type="java.sql.Timestamp" />
  <field-type-def type="short-varchar" sql-type="VARCHAR(60)" java-type="String" />
  <field-type-def type="very-long" sql-type="TEXT" java-type="String" />
  <field-type-def type="credit-card-number" sql-type="VARCHAR(40)" java-type="String">
    <validate name="isAnyCard"/>
  </field-type-def>
  ...
</fieldtypemodel>
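The idea that "logical model + field mappings = physical db" can be sketched in plain Java. This is a toy, not how the OFBiz Entity Engine is implemented: `DdlSketch`, the `note` entity and its fields are made up for illustration, and only the three type mappings come from the sample above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Toy illustration only; the real Entity Engine reads XML and does far more.
class DdlSketch {
    // fields: column name -> logical type; sqlTypes: logical type -> database-specific SQL type
    static String createTable(String table, Map<String, String> fields, Map<String, String> sqlTypes) {
        StringJoiner columns = new StringJoiner(", ", "CREATE TABLE " + table + " (", ")");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            String sqlType = sqlTypes.get(field.getValue());
            if (sqlType == null) {
                throw new IllegalArgumentException("No SQL mapping for type " + field.getValue());
            }
            columns.add(field.getKey() + " " + sqlType);
        }
        return columns.toString();
    }

    public static void main(String[] args) {
        // Mappings taken from the fieldtypes sample
        Map<String, String> postgresTypes = new LinkedHashMap<>();
        postgresTypes.put("date-time", "TIMESTAMP");
        postgresTypes.put("short-varchar", "VARCHAR(60)");
        postgresTypes.put("very-long", "TEXT");

        // Hypothetical logical entity, invented for this example
        Map<String, String> note = new LinkedHashMap<>();
        note.put("created", "date-time");
        note.put("subject", "short-varchar");
        note.put("body", "very-long");

        System.out.println(createTable("note", note, postgresTypes));
    }
}
```

Swapping in a different type map is all it takes to target another database, which is the point of keeping the two files separate.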
39
Component Tour: Indexing
Component: Jakarta Lucene
40
Indexing: Lucene
• jakarta.apache.org/lucene
• The full-text index and search component
• Highly scalable and efficient architecture
• Why use full text searching?
– No more SQL: LIKE ‘%foo%’
• Used for full text search of blog entries
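Why an index beats `LIKE '%foo%'` can be shown with a toy inverted index. This is deliberately not Lucene's API; `TinyIndex` is a hypothetical class that maps each term to the ids of the entries containing it, so a search is a map lookup rather than a scan of every row.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy inverted index for illustration; Lucene adds analysis, ranking, persistence and much more.
class TinyIndex {
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Splits the text into lower-case terms and records the entry id under each term.
    void add(int entryId, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(entryId);
            }
        }
    }

    // Returns the ids of entries containing the term (empty set if none).
    Set<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT), Collections.emptySet());
    }

    public static void main(String[] args) {
        TinyIndex index = new TinyIndex();
        index.add(1, "AOP is everywhere this week");
        index.add(2, "Nothing about aspects here");
        System.out.println(index.search("AOP"));
    }
}
```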
41
Component Tour: Security
Components: OSUser & atlassian-security
42
User Management: OSUser
• www.opensymphony.com/osuser
• Server-agnostic user management API
• Handles user, group and profile data
• Why needed?
– User data different to other data
– Pluggable storage providers (ie LDAP)
– Writing portable J2EE auth code very hard
43
User Management: OSUser
• Providers for storage:
– EJB, JDBC, OFBiz, Hibernate, LDAP, XML files etc
• Adapters for server integration
– Weblogic, Orion, JBoss, Resin etc
[Diagram labels: javablogs code; OSUser; adapters; Application Server; providers; Storage (users, groups, profiles)]
44
User Management: OSUser
• Retrieving a specific user and their groups:
    User user = userManager.getUser("fred");
    Collection groups = user.getGroups();
• Retrieving a user’s profile:
    PropertySet profile = user.getPropertySet();
    String country = profile.getString("user.address.country");
    Date signedUp = profile.getDate("user.signup");
    Long logins = profile.getLong("user.numlogins");
• Creating a new user in the administrators group:
    User newUser = userManager.createUser("mike");
    newUser.setEmail("[email protected]");
    newUser.addToGroup(userManager.getGroup("administrators"));
45
Security: atlassian-security
• Security is worst part of Servlet spec
– Not portable between servers
– Very limited URL patterns
• To perform effective security checks on a web application, need to roll your own framework.
• atlassian-security is our solution to this
– in process of being open sourced
46
atlassian-security architecture
[Diagram groups these parts into Integration Points and Concepts]
• Security Services: Calculate the authorisation needed for a particular request
• Interceptors: Interject code to run before / after security events
• Authenticator: Authenticate a user (login, logout, retrieve) against a user system
• Controller: Governs whether security checks are enabled or disabled
• SecurityFilter: Calculates roles required for this request against all configured security services
• LoginFilter: Looks for credentials (eg username / password) in the request and tries to log user in if found
47
atlassian-security Services
• Services allow you to check the roles required for any request
• Two bundled services:
– Path Service: looks at the request URL. Allows extremely flexible path lookups
  • /admin/*, **/admin/*, /admin/Setup* etc
– WebWork Service: looks at the action being executed
• Other examples could be IPService or KeyService.
48
atlassian-security PathService
• PathService is configured in a security-paths.xml file:
<security-paths>
  <path name="admin">
    <url-pattern>/secure/admin/*</url-pattern>
    <role-name>administrators</role-name>
  </path>
  <path name="user">
    <url-pattern>/secure/*</url-pattern>
    <role-name>users, developers</role-name>
  </path>
</security-paths>
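A path-to-roles lookup of this kind might be sketched as follows. This is not the atlassian-security implementation: `PathRoles` is a hypothetical class, it supports only a trailing `*` wildcard, and it assumes the first matching pattern wins (so the more specific admin path is registered first, as in the XML above).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; real matching rules (eg "**/admin/*") are richer than this.
class PathRoles {
    // LinkedHashMap keeps registration order, so earlier patterns take priority.
    private final Map<String, List<String>> rolesByPattern = new LinkedHashMap<>();

    void add(String urlPattern, String... roles) {
        rolesByPattern.put(urlPattern, Arrays.asList(roles));
    }

    // Roles required for the path: those of the first matching pattern, or none.
    List<String> rolesFor(String path) {
        for (Map.Entry<String, List<String>> entry : rolesByPattern.entrySet()) {
            if (matches(entry.getKey(), path)) {
                return entry.getValue();
            }
        }
        return Collections.emptyList();
    }

    // Assumption: a trailing "*" means "any path with this prefix"; otherwise exact match.
    static boolean matches(String pattern, String path) {
        if (pattern.endsWith("*")) {
            return path.startsWith(pattern.substring(0, pattern.length() - 1));
        }
        return pattern.equals(path);
    }

    public static void main(String[] args) {
        PathRoles paths = new PathRoles();
        paths.add("/secure/admin/*", "administrators");
        paths.add("/secure/*", "users", "developers");
        System.out.println(paths.rolesFor("/secure/admin/Setup.jspa"));
    }
}
```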
49
atlassian-security Interceptors
• Problem: run code before / after security events
– Events? login, logout, auth attempts etc.
– Under the Servlet spec you can’t do this.
• Solution: AOP-like interceptors
• Example: used in java.blogs to store the date of last login, and number of logins
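The interceptor idea can be sketched in a few lines. All names below (`LoginInterceptor`, `InterceptingAuthenticator`, `LoginCounter`) are hypothetical and are not the atlassian-security API; the sketch only shows a login counter hooked in after each login attempt, similar in spirit to the java.blogs use described above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical callback invoked after every login attempt.
interface LoginInterceptor {
    void afterLogin(String username, boolean success);
}

// Hypothetical authenticator that notifies interceptors; passwords are kept in a map for brevity.
class InterceptingAuthenticator {
    private final Map<String, String> passwords;
    private final List<LoginInterceptor> interceptors = new ArrayList<>();

    InterceptingAuthenticator(Map<String, String> passwords) {
        this.passwords = passwords;
    }

    void addInterceptor(LoginInterceptor interceptor) {
        interceptors.add(interceptor);
    }

    boolean login(String username, String password) {
        boolean ok = password != null && password.equals(passwords.get(username));
        for (LoginInterceptor interceptor : interceptors) {
            interceptor.afterLogin(username, ok); // runs for failed attempts too
        }
        return ok;
    }
}

// Counts successful logins per user.
class LoginCounter implements LoginInterceptor {
    final Map<String, Integer> counts = new HashMap<>();

    @Override
    public void afterLogin(String username, boolean success) {
        if (success) {
            counts.merge(username, 1, Integer::sum);
        }
    }

    public static void main(String[] args) {
        Map<String, String> users = new HashMap<>();
        users.put("fred", "secret");
        InterceptingAuthenticator auth = new InterceptingAuthenticator(users);
        LoginCounter counter = new LoginCounter();
        auth.addInterceptor(counter);
        auth.login("fred", "secret");
        System.out.println(counter.counts);
    }
}
```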
50
Component Tour: MVC
Component: OpenSymphony WebWork
51
MVC: WebWork
• www.opensymphony.com/webwork
• MVC framework
• Implementation of GoF command pattern
– WW actions are ‘method objects’
• Used for all actions in javablogs.com, with JSP views
• “WebWork 2” presentation later today for detailed information.
52
MVC: WebWork
• Simple, clean APIs
– No more formbeans
• Field or model driven actions
• Actions are easily testable
• View agnostic
– Supports JSP, Velocity, XML, JasperReports
• Let’s look at a brief example…
53
MVC: WebWork - Result
54
MVC: WebWork - Action
public class AddBlog extends WebsiteActionSupport {
    // fields
    // getters & setters …

    protected void doValidation() {
        // Default to http:// when the user leaves the scheme off
        if (url != null && !url.startsWith("http://"))
            url = "http://" + url;

        if (url == null)
            addError("url", "Please fill in a blog URL");
    }

    protected String doExecute() {
        try {
            BlogUtils.addBlog(url, getRemoteUser());
            return SUCCESS;
        } catch (BlogException e) {
            addErrorMessage(e.getMessage());
            return ERROR;
        }
    }
}
55
MVC: WebWork - Config
• actions.xml is where actions are configured:
<action name="user.AddBlog" alias="AddBlog">
  <view name="success">/secure/views/user/addblog2.jsp</view>
  <view name="error">/secure/views/user/addblog.jsp</view>
</action>
56
MVC: WebWork - View
<html>
<head><title>Add Blog</title></head>
<body>
  <page:applyDecorator name="jbform">
    <page:param name="description">
      <b>Step 1 of 2</b>: Specify the url of the blog to add.
    </page:param>
    <page:param name="action">AddBlog.jspa</page:param>
    <page:param name="submitName">Add</page:param>
    <page:param name="cancelURI">ViewProfile.jspa</page:param>
    <page:param name="width">100%</page:param>
    <ui:textfield label="'RSS Feed URL'" name="'url'" size="'70'"/>
  </page:applyDecorator>
</body>
</html>
57
MVC: WebWork - Error
58
Component Tour: Presentation
Component: OpenSymphony SiteMesh
59
Presentation: SiteMesh
• www.opensymphony.com/sitemesh
• Page layout and decoration engine
• Clean separation of content vs presentation
– Designers design presentation
– Developers develop content
• Applies GoF decorator pattern to HTML
– Metaphor: Swing look and feel changer
60
Decorated page: Green sections are put there by the decorator.
61
Un-decorated page: Notice that there is no header / footer, and no style sheet.
62
Another page: using a TSS decorator - no GIF trickery!
http://www.javablogs.com/Welcome.jspa?decorator=tss
63
Presentation: SiteMesh
[Diagram: SiteMesh request flow]
1. Render content: the content source (JSP, Perl, PHP, HTML etc) produces content (HTML fragment)
2. Parse HTML into a field map
3. Select decorator for request (decorator mappers pick the presentation, a decorator JSP)
4. Merge content & presentation into the result (HTML)
64
SiteMesh: Raw HTML
<html>
<head>
  <title>About JavaBlogs</title>
  <meta name="section" content="About">
</head>
<body bgcolor="#ffffff">
  JavaBlogs <b>aggregates</b> the blogs of Java bloggers.
</body>
</html>
• Benefit: keeps your HTML simple
65
SiteMesh: Parsed Field Map
• Turns your HTML into a map of fields
Key            Value
title          About JavaBlogs
meta.section   About
body           JavaBlogs <b>aggregates</b> the blogs of Java bloggers.
body.bgcolor   #ffffff
66
SiteMesh: Decorator JSP
<%@ taglib uri="sitemesh-decorator" prefix="dec" %>
<html>
<head>
  <title>java.blogs - <dec:title /></title>
</head>
<body bgcolor="<dec:getProperty property="body.bgcolor" />">
  <h2 class="pagetitle"><dec:title /></h2>
  <dec:isPropertySet name="meta.section">
    You are in the <dec:getProperty property="meta.section" /> section.<br>
  </dec:isPropertySet>
  <p><dec:body /></p>
</body>
</html>
• Decorators are generally JSP pages, and use a single sitemesh-decorator taglib
• Benefit: simple for designers to understand
67
SiteMesh: Result Code
<html>
<head>
  <title>java.blogs - About JavaBlogs</title>
</head>
<body bgcolor="#ffffff">
  <h2 class="pagetitle">About JavaBlogs</h2>
  You are in the About section.<br>
  <p>JavaBlogs <b>aggregates</b> the blogs of Java bloggers.</p>
</body>
</html>
• Resulting code is plain HTML again
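The parse-then-merge step can be imitated in a few lines of plain Java. This toy is not SiteMesh: `ToyDecorator` is a hypothetical class, it pulls out only the title and body with regular expressions (fine for a demo, not for real HTML), and its "decorator" is a hard-coded string rather than a JSP.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy imitation of the decorate step; real SiteMesh uses a proper page parser and JSP decorators.
class ToyDecorator {
    private static final Pattern TITLE =
            Pattern.compile("<title>(.*?)</title>", Pattern.DOTALL | Pattern.CASE_INSENSITIVE);
    private static final Pattern BODY =
            Pattern.compile("<body[^>]*>(.*?)</body>", Pattern.DOTALL | Pattern.CASE_INSENSITIVE);

    // Turns the page into a small field map with "title" and "body" keys.
    static Map<String, String> parse(String html) {
        Map<String, String> fields = new LinkedHashMap<>();
        extract(fields, "title", TITLE, html);
        extract(fields, "body", BODY, html);
        return fields;
    }

    private static void extract(Map<String, String> fields, String key, Pattern pattern, String html) {
        Matcher matcher = pattern.matcher(html);
        if (matcher.find()) {
            fields.put(key, matcher.group(1).trim());
        }
    }

    // Merges the parsed fields into a fixed layout, like a very small decorator.
    static String decorate(String html) {
        Map<String, String> fields = parse(html);
        String title = fields.getOrDefault("title", "");
        return "<html><head><title>java.blogs - " + title + "</title></head>"
                + "<body><h2 class=\"pagetitle\">" + title + "</h2>"
                + "<p>" + fields.getOrDefault("body", "") + "</p></body></html>";
    }

    public static void main(String[] args) {
        System.out.println(decorate("<html><head><title>About JavaBlogs</title></head>"
                + "<body bgcolor=\"#ffffff\">JavaBlogs <b>aggregates</b> the blogs of Java bloggers.</body></html>"));
    }
}
```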
68
Presentation: SiteMesh
• How are decorators chosen?
– Chosen on meta data in the fields map and request objects
• Mapping is decoupled from pages themselves
– No more fragile <jsp:include .. /> statements
• Decorator mappers (DM):
– PageDM: Uses <meta> properties
– FrameSetDM: Handles framed sites
– PrintableDM: For making printable versions
– RobotsDM: Serves robots special decorators
– ParameterDM: Uses request parameters
– ConfigDM: Uses decorators.xml and URL paths
69
Presentation: SiteMesh
• Most often use ConfigDecoratorMapper - decorators.xml:
<decorators>
  <decorator-mapping decorator="none">
    <url-pattern>/styles/*</url-pattern>
  </decorator-mapping>
  <decorator name="admin" page="/decorators/admin.jsp">
    <url-pattern>/secure/admin/*</url-pattern>
  </decorator>
</decorators>
70
Component Tour: Mail
Component: Jakarta Velocity
71
Mail: Velocity
• Problem: generating email content is not as easy as HTML content
• We use Velocity to generate text and HTML email
• Alternatives?
– JSP pages with wget, String replacement, Java code!
72
Mail: Velocity
• http://jakarta.apache.org/velocity
• Text templating engine from Jakarta
• Advantages
– Extremely fast to generate lots of text
– Simple syntax to learn
– Plugs into your existing model easily
• Merges in a similar way to SiteMesh
– Separates content & presentation
[Diagram: Velocity Template + Velocity Context (Map of fields) → Merged Result]
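The template-plus-context merge can be illustrated with a toy substitution routine. This is not Velocity's API: `ToyTemplate` is a hypothetical class that only replaces `${name}` references from a map and has no `#foreach` or `#if`. Leaving unknown references untouched is a choice made for this sketch.

```java
import java.util.Collections;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy merge of a template with a map of fields; real Velocity supports directives and method calls.
class ToyTemplate {
    private static final Pattern REFERENCE = Pattern.compile("\\$\\{(\\w+)\\}");

    // Replaces each ${name} with context.get(name); unknown names are left as written.
    static String merge(String template, Map<String, String> context) {
        Matcher matcher = REFERENCE.matcher(template);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String value = context.getOrDefault(matcher.group(1), matcher.group(0));
            // quoteReplacement stops "$" and "\" in the value being treated as group references
            matcher.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> context = Collections.singletonMap("baseurl", "http://www.javablogs.com");
        System.out.println(merge("Most popular entries on ${baseurl}", context));
    }
}
```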
73
Mail: Velocity - Template
• Here is text/dailyemail.vm as a sample:
java.blogs daily update
-----------------------
Here is a list of the most popular blog entries from the last 24 hours
on ${baseurl}, brought to you by the world's best issue tracker:

JIRA 2 - http://www.atlassian.com/software/jira - enjoy! :)

Entries
-------
#foreach ($entry in $entries)
#if (${entry.getString("link")})
${entry.getString("title")}
http://www.javablogs.com/Jump.jspa?id=${entry.getLong("id")}
${entry.getRelatedOne("ParentBlog").getString("title")}

#end
#end
74
Mail: Velocity - Result
• Resulting email looks like:

java.blogs daily update
-----------------------
Here is a list of the most popular blog entries from the last 24 hours
on http://www.javablogs.com/Welcome.jspa, brought to you by the world's best issue tracker:

JIRA 2 - http://www.atlassian.com/software/jira - enjoy! :)

Entries
-------
OSCache: You too can pretend you're competent!
http://www.javablogs.com/Jump.jspa?id=38371
The BileBlog

BlueOxygen Java Open Source Collection Wow,
http://www.javablogs.com/Jump.jspa?id=38374
Frans Thamura's Blogger

. . .
75
Component Tour: Build System
Component: Apache Maven
76
Build System: Maven
• Most popular Java build tool - Ant
• Problems with Ant?
– lots of duplicate code
  • we don’t tolerate duplication in our Java code, why tolerate it in our build.xml?!
– no dependency management
• Maven is one attempt at an Ant alternative
77
Build System: Maven
• maven.apache.org
• Build system evolving from Ant
– can still call any Ant task from Maven
• Aim: to separate meta data from function
• Still very ‘beta’
• Plugins to do many common tasks
– ie compile, JAR, clean syntax, upload files
78
Build System: Maven
• Manages and downloads dependencies
<dependency>
  <id>velocity</id>
  <version>1.3</version>
  <properties>
    <war.bundle.jar>true</war.bundle.jar>
  </properties>
</dependency>
• Each developer maintains a local repository
– One copy of each JAR on your system
• Dependencies not found locally are downloaded
– example above will download velocity-1.3.jar
79
Important Lessons
• Performance vs caching
– Cache 20% of pages that generate 80% of load
– OSCache used heavily to cache web tier
– Gzip filter to reduce bandwidth
  • javablogs.com churns > 1GB traffic/day
• Working in a fragile environment
– Users complained that javablogs wasn’t working
– Problem: RSS feeds can go down / be malformed
– Solution: Provide feedback on errors to users directly
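The "cache the hot 20% of pages" idea can be sketched with a small least-recently-used cache built on the JDK's `LinkedHashMap`. This is not OSCache; `PageCache` is a hypothetical class, and the capacity and keys are made up for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical LRU cache of rendered pages; OSCache offers expiry, disk storage, JSP tags and more.
class PageCache extends LinkedHashMap<String, String> {
    private final int maxPages;

    PageCache(int maxPages) {
        super(16, 0.75f, true); // accessOrder = true: iteration order is least recently used first
        this.maxPages = maxPages;
    }

    // Called by LinkedHashMap after each insert; returning true evicts the least recently used page.
    @Override
    protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
        return size() > maxPages;
    }

    public static void main(String[] args) {
        PageCache cache = new PageCache(2);
        cache.put("/Welcome.jspa", "<html>welcome</html>");
        cache.put("/Popular.jspa", "<html>popular</html>");
        cache.get("/Welcome.jspa");                        // marks Welcome as recently used
        cache.put("/Search.jspa", "<html>search</html>");  // evicts Popular
        System.out.println(cache.keySet());
    }
}
```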
80
Future Challenges
• Challenges:
– How do we decide what a ‘good’ post is?
– Should we display non-Java posts?
• Solution: Algorithmically calculate ‘best’ posts
– Needs to be simple and as automated as possible
– Distributed moderation framework?
– What metrics to use?
  • Views vs user votes vs java-ness
• Solution: Bayesian filtering of posts
– Solve the java vs non-java problem
– Classifier4J - http://classifier4j.sourceforge.net/
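A rough idea of what Bayesian filtering of posts involves, as a toy naive Bayes classifier with add-one smoothing. This is not Classifier4J's API; `ToyBayes` and the training sentences are invented for illustration, and a real filter would need far more training data and better tokenising.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Toy two-class naive Bayes ("java" vs "other"); illustration only.
class ToyBayes {
    private final Map<String, Integer> javaCounts = new HashMap<>();
    private final Map<String, Integer> otherCounts = new HashMap<>();
    private int javaWords, otherWords, javaPosts, otherPosts;

    // Records the words of one labelled post.
    void train(String text, boolean isJava) {
        for (String word : words(text)) {
            if (word.isEmpty()) continue;
            if (isJava) { javaCounts.merge(word, 1, Integer::sum); javaWords++; }
            else { otherCounts.merge(word, 1, Integer::sum); otherWords++; }
        }
        if (isJava) javaPosts++; else otherPosts++;
    }

    // Compares log-probabilities of the two classes; add-one smoothing avoids log(0).
    boolean isJava(String text) {
        double javaScore = Math.log((javaPosts + 1.0) / (javaPosts + otherPosts + 2.0));
        double otherScore = Math.log((otherPosts + 1.0) / (javaPosts + otherPosts + 2.0));
        int vocabulary = vocabularySize();
        for (String word : words(text)) {
            if (word.isEmpty()) continue;
            javaScore += Math.log((javaCounts.getOrDefault(word, 0) + 1.0) / (javaWords + vocabulary));
            otherScore += Math.log((otherCounts.getOrDefault(word, 0) + 1.0) / (otherWords + vocabulary));
        }
        return javaScore > otherScore;
    }

    private int vocabularySize() {
        Set<String> all = new HashSet<>(javaCounts.keySet());
        all.addAll(otherCounts.keySet());
        return Math.max(all.size(), 1);
    }

    private static String[] words(String text) {
        return text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+");
    }

    public static void main(String[] args) {
        ToyBayes filter = new ToyBayes();
        filter.train("servlet jsp struts hibernate", true);
        filter.train("my cat ate breakfast", false);
        System.out.println(filter.isJava("a new servlet filter"));
    }
}
```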
81
Want More?
• The site - http://www.javablogs.com
• My blog - http://blogs.atlassian.com/rebelutionary
• Email - [email protected]
Thank you for listening - questions?
Mike Cannon-Brookes
ATLASSIAN - www.atlassian.com