muves3 elastic grid java one2009 final
Cloud Computing and the NetBeans IDE Enable the Army Research Laboratory’s Next-Generation Simulation System.
Ron Bowers, Army Research Laboratory
Dennis Reedy, Elastic Grid LLC
3 June 2009
Agenda
> Where we work and what we do
> Overview of the architecture
> Why and how we are using Cloud Computing
> Next steps and follow up
The Army Research Laboratory
> The Army Research Laboratory (ARL) is the Army's corporate basic and applied research laboratory. Our mission is to provide innovative science, technology, and analysis to enable full-spectrum operations.
> We represent the Survivability/Lethality Analysis Directorate (SLAD) of ARL.
We have some experience with computers...
SLAD Mission
> Ensure that US personnel and equipment survive and function effectively in hostile circumstances
  • Nuclear, Biological and Chemical Attack
  • Ballistic Threats
  • Electronic Warfare
  • Information Warfare
> SLAD performs both experimentation and modeling
SLAD Ballistic Vulnerability/Lethality (V/L) Modeling
> SLAD’s primary tool for performing ballistic V/L analysis is MUVES.
> MUVES development began in 1984.
> The current version (2.16) is a single-threaded C application.
> We are currently developing MUVES 3, which is an all-new replacement system.
Program Focus
> Provide the next generation of simulation system for the V/L analyst community
  • Mostly Java.
  • Dynamic distributed and service-oriented.
  • Will support over 100 concurrent users.
  • Incorporates a computational grid: a parallelized system that distributes tasks and computes results that are graphically displayed.
  • Will operate in both “batch” and interactive modes.
Interesting Challenges
> We have few servers, but many powerful workstations.
  • Architecture must exploit analyst community machines
  • Share CPU, memory and disk
  • Heterogeneous deployment environment
> Required application assets vary
  • Need real-time provisioning of application assets
  • Must be able to route functionality to machines that are best capable of executing tasks/functions
  • Must be able to scale on demand based on real-time need and use of the system
> Legacy of performance issues and nightmares
Solution Approach
> Choose technology that embraces dynamic distributed capabilities
> Craft a loosely coupled service oriented architecture that segments the system into functional roles
> Choose persistence technologies and approaches that allow for low latency and high concurrency
> Represent data as it moves through the system: In-flight (hot in-memory), Swap, Long Term, Archived
> Keep Disk I/O out of the main stream processing
What’s Underneath
[Layered architecture diagram, built up over several slides]
> Layers: Application Infrastructure; Quality of Service; Monitoring and Management; Domain-specific Services and Algorithms; Dynamic Container; Persistence Management
> Technologies added in successive builds: JavaSpace, Apache ActiveMQ, and Apache Derby; then Rio; then Gomez
Gomez
> Was established as a prototype for MUVES 3 architectural enhancements.
> Now forms the basis of the MUVES 3 service-oriented architecture (SOA).
> Includes all of the non-sensitive services used by MUVES 3.
> Is an open source project (LGPLv3).
MUVES 3 System Organization
[Diagram: Client (MUVES 3 UI), Gateway, Sim, Persistence]
> Gateway
  • Attachment point for clients.
  • Monitors system load.
  • Controls job submission.
> Sim
  • Executes analysis jobs.
> Persistence
  • Stores analysis results.
MUVES 3 UI
> The MUVES 3 UI is built on the NetBeans Platform.
> Used for:
  • Input preparation
  • Team collaboration
  • Job submittal and monitoring
  • Result visualization
> Components that interact with back-end services are developed in Gomez and used in the MUVES 3 UI.
Our NetBeans Experience
> Favorite things
  • Swing-based
  • Fast form development
  • Easy deployment via JNLP
  • Extremely well supported by the community
> Biggest issue
  • Integrating libraries that are updated frequently
MUVES 3 Execution
[Diagram, built up over several slides]
> Components: Client, Gateway, Persistence, Sim Pools (one marked Busy)
> Steps shown: Select Sim Pool; Submit job
> Inside the Sim Pool: Job Monitor, Task Space, Workers
> Deploy additional services: Ray Tracer, Personnel Vulnerability, Specialized Physics, Vehicle Performance
> Submit job; Store results; Visualize results
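The Job Monitor, Task Space, and Worker labels in the Sim Pool suggest a master/worker arrangement. A minimal sketch of that pattern follows, assuming a job is split into independent tasks that workers take from a shared space. The slides do not say how MUVES 3 actually splits or schedules work, so everything here is hypothetical: `TaskSpaceSketch`, `Task`, `runJob`, and the use of a `BlockingQueue` as a stand-in for a JavaSpace.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Illustrative master/worker sketch; an in-process queue stands in for the task space. */
public class TaskSpaceSketch {

    /** Hypothetical unit of work: square one number. */
    public static class Task {
        final int id;
        final int input;
        Task(int id, int input) { this.id = id; this.input = input; }
    }

    /** Marker task that tells a worker to stop. */
    private static final Task POISON = new Task(-1, 0);

    /** Writes one task per input, lets the workers drain them, and returns the sum of results. */
    public static long runJob(int[] inputs, int workerCount) {
        BlockingQueue<Task> taskSpace = new LinkedBlockingQueue<>();
        BlockingQueue<Long> resultSpace = new LinkedBlockingQueue<>();

        // The "job monitor" role: write all tasks, then one stop marker per worker.
        for (int i = 0; i < inputs.length; i++) {
            taskSpace.add(new Task(i, inputs[i]));
        }
        for (int w = 0; w < workerCount; w++) {
            taskSpace.add(POISON);
        }

        // The "worker" role: take a task, compute, write the result back.
        List<Thread> workers = new ArrayList<>();
        for (int w = 0; w < workerCount; w++) {
            Thread worker = new Thread(() -> {
                try {
                    while (true) {
                        Task task = taskSpace.take();
                        if (task == POISON) {
                            return;
                        }
                        resultSpace.put((long) task.input * task.input);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            worker.start();
            workers.add(worker);
        }

        // The job monitor collects exactly one result per task.
        try {
            long sum = 0;
            for (int i = 0; i < inputs.length; i++) {
                sum += resultSpace.take();
            }
            for (Thread worker : workers) {
                worker.join();
            }
            return sum;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("job interrupted", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("sum of squares = " + runJob(new int[] {1, 2, 3, 4}, 3));
    }
}
```

Because workers pull tasks rather than being assigned them, adding workers needs no change on the submitting side, which is one reason this pattern fits a pool that grows on demand.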
Dynamic Clustering
[Diagram labels: Client Proxy, Service Interface, Service (three instances), Rio, Association declaration, Service discovery, Service injection, ServiceSelectionStrategy, Proxy injection]
> Available strategies:
  1. Fail-Over – Uses one service unless that service goes away.
  2. Round-Robin – Iterates over all discovered services.
  3. Utilization – Like round-robin, but ignores services that are low on system resources.
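The three strategies can be sketched in plain Java as below. This is not Rio's ServiceSelectionStrategy API; `SelectionSketch`, `ServiceRef`, `Strategy`, and the 0.9 utilization threshold are hypothetical names and values chosen for illustration, and a service "going away" is modeled simply as its removal from the discovered list.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: three ways to pick one service from a discovered set. */
public class SelectionSketch {

    /** Hypothetical handle to a discovered service and its current load (0.0 to 1.0). */
    public static class ServiceRef {
        public final String name;
        public volatile double utilization;
        public ServiceRef(String name, double utilization) {
            this.name = name;
            this.utilization = utilization;
        }
    }

    /** Picks the service that should receive the next call. */
    public interface Strategy {
        ServiceRef next(List<ServiceRef> discovered);
    }

    /** Fail-Over: stay on the first discovered service until it leaves the list. */
    public static class FailOver implements Strategy {
        public ServiceRef next(List<ServiceRef> discovered) {
            if (discovered.isEmpty()) {
                throw new IllegalStateException("no services discovered");
            }
            return discovered.get(0);
        }
    }

    /** Round-Robin: iterate over all discovered services in turn. */
    public static class RoundRobin implements Strategy {
        private int cursor = 0;
        public synchronized ServiceRef next(List<ServiceRef> discovered) {
            if (discovered.isEmpty()) {
                throw new IllegalStateException("no services discovered");
            }
            // The modulo keeps the cursor valid if the list shrank since the last call.
            ServiceRef chosen = discovered.get(cursor % discovered.size());
            cursor = (cursor + 1) % discovered.size();
            return chosen;
        }
    }

    /** Utilization: round-robin, but only over services below a load threshold. */
    public static class Utilization implements Strategy {
        private final double threshold;
        private final RoundRobin delegate = new RoundRobin();
        public Utilization(double threshold) {
            this.threshold = threshold;
        }
        public ServiceRef next(List<ServiceRef> discovered) {
            List<ServiceRef> available = new ArrayList<>();
            for (ServiceRef service : discovered) {
                if (service.utilization < threshold) {
                    available.add(service);
                }
            }
            return delegate.next(available);
        }
    }

    public static void main(String[] args) {
        List<ServiceRef> sims = new ArrayList<>();
        sims.add(new ServiceRef("sim-a", 0.20));
        sims.add(new ServiceRef("sim-b", 0.95)); // low on resources
        sims.add(new ServiceRef("sim-c", 0.40));

        Strategy strategy = new Utilization(0.9);
        for (int call = 0; call < 4; call++) {
            System.out.println("call " + call + " goes to " + strategy.next(sims).name);
        }
    }
}
```

Keeping the strategy behind one small interface is what lets the client proxy switch policies without the calling code changing.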
The Persistence Meta-service
> Stores the results of analyses.
> Consists of four layers:
  • In-memory cache (distributed JavaSpace)
  • Swap (Apache ActiveMQ)
  • Long-term storage (Apache Derby)
  • Archive (Hibernate+Oracle)
> Layers are implemented using dynamic clustering.
> Supports data life-cycle management
  • Data storage is leased, and leases can expire.
  • When the cache fills, results are moved to swap.
> Must work for the next 20 years.
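The two life-cycle rules on this slide (leased storage, and cache overflow moving to swap) can be illustrated with a small in-process sketch. It assumes a fixed-capacity cache that spills its oldest entry and a caller-supplied clock; the slides do not state the eviction order or the lease API, so `LeasedStoreSketch`, `Leased`, `tierOf`, and the plain maps standing in for the JavaSpace and ActiveMQ layers are all hypothetical. The long-term and archive layers are left out.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative two-tier store: a bounded cache that spills to swap, with leased entries. */
public class LeasedStoreSketch {

    /** A stored value together with the time at which its lease expires. */
    static class Leased {
        final String value;
        final long expiresAt;
        Leased(String value, long expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }
    }

    private final int cacheCapacity;
    // Insertion-ordered, so the first entry is always the oldest one.
    private final LinkedHashMap<String, Leased> cache = new LinkedHashMap<>();
    private final Map<String, Leased> swap = new HashMap<>();

    public LeasedStoreSketch(int cacheCapacity) {
        if (cacheCapacity < 1) {
            throw new IllegalArgumentException("cache capacity must be at least 1");
        }
        this.cacheCapacity = cacheCapacity;
    }

    /** Stores a result under a lease; when the cache is full, the oldest entry moves to swap. */
    public void put(String key, String value, long now, long leaseMillis) {
        cache.remove(key);
        swap.remove(key);
        if (cache.size() >= cacheCapacity) {
            Iterator<Map.Entry<String, Leased>> oldest = cache.entrySet().iterator();
            Map.Entry<String, Leased> spilled = oldest.next();
            swap.put(spilled.getKey(), spilled.getValue());
            oldest.remove();
        }
        cache.put(key, new Leased(value, now + leaseMillis));
    }

    /** Returns the value if present in either tier and its lease is still valid, else null. */
    public String get(String key, long now) {
        Leased hit = cache.containsKey(key) ? cache.get(key) : swap.get(key);
        if (hit == null) {
            return null;
        }
        if (now >= hit.expiresAt) {
            // Lease expired: drop the entry from whichever tier holds it.
            cache.remove(key);
            swap.remove(key);
            return null;
        }
        return hit.value;
    }

    /** Reports which tier currently holds the key: "cache", "swap", or null. */
    public String tierOf(String key) {
        if (cache.containsKey(key)) {
            return "cache";
        }
        if (swap.containsKey(key)) {
            return "swap";
        }
        return null;
    }

    public static void main(String[] args) {
        LeasedStoreSketch store = new LeasedStoreSketch(2);
        store.put("job-1", "r1", 0, 1000);
        store.put("job-2", "r2", 0, 1000);
        store.put("job-3", "r3", 0, 1000); // cache is full, so job-1 spills to swap
        System.out.println("job-1 is in " + store.tierOf("job-1"));
        System.out.println("job-1 at t=500: " + store.get("job-1", 500));
        System.out.println("job-1 at t=1000: " + store.get("job-1", 1000));
    }
}
```

Passing the clock in as a parameter keeps the lease logic deterministic and easy to test.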
Cloud Computing
> Public Cloud: Obvious national security issues
> Private Cloud: Conceptually already there
  • Real-time provisioning of applications
  • Dynamic allocation
  • SLA-based approach
Cloud Computing: Applicable for Testing?
> Goals
  • Demonstrate the performance and scalability of MUVES 3 over dozens (or hundreds :-) ) of computers.
  • Execute multiple integration tests concurrently.
> Issues
  • Small number of computers available locally for testing.
  • Difficult coordination issues due to Army security policies.
> Approach
  • Test the MUVES 3 architecture (Gomez) on Amazon Elastic Compute Cloud (EC2)
Cloud Adoption Challenges
> Getting approval :)
> Administrative burden
  • We don’t want to build AMIs or go through the time to provision an entire stack every night.
> Minimize changes
  • Avoid developing special code and testing framework for cloud deployment/orchestration.
  • Ideally, transparently switch from LAN-based deployment to the cloud.
> We must preserve the dynamic distributed semantics architected in the system:
  • Service selection strategies
  • Dynamic discovery semantics
> Run multiple concurrent test cases and roll up test results.
Cloud Adoption Approach
> Use Elastic Grid (EG)
  • Eases development and deployment of Java applications into the Cloud
  • Provides automated management, fault detection, and scalability for the application
  • Allows focus on development, not cloud infrastructure
Elastic Grid Overview
> Cloud Management Fabric
  • Dynamically instantiate, monitor & manage application components
  • SLA policy driven with strategies like service scalability, relocation, fault detection & recovery, etc.
> Cloud Virtualization Layer
  • Abstracts specific Cloud Computing provider technology
  • Allows portability across specific implementations
  • You can deploy on Amazon EC2, private LAN-based Cloud; Eucalyptus, Sun Cloud and others soon
Cloud Activation & Deployment
[Diagram: Groovy Client, S3, and Test Clusters running Application Monitors and Application Agents]
1. Build, create release, upload archive (to S3)
2. Create clusters
3. Deploy: download and deploy application resources
4. Upload JUnit test results (to S3)
5. Download test results and post-process
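The last step of this flow, downloading test results and post-processing them, amounts to rolling up one report per test run. A minimal sketch follows, assuming each report is an Ant-style JUnit XML document whose root `<testsuite>` element carries `tests`, `failures`, and `errors` attributes. The slides do not describe the report format or the post-processing actually used, and `TestRollupSketch`, `Totals`, and `rollUp` are hypothetical names.

```java
import java.io.StringReader;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

/** Illustrative roll-up of several JUnit XML reports into one set of totals. */
public class TestRollupSketch {

    /** Combined counts across all reports. */
    public static class Totals {
        public int tests;
        public int failures;
        public int errors;
        public boolean passed() {
            return failures == 0 && errors == 0;
        }
    }

    /** Parses each report and sums the counts found on its root testsuite element. */
    public static Totals rollUp(List<String> reports) {
        Totals totals = new Totals();
        for (String xml : reports) {
            try {
                Element suite = DocumentBuilderFactory.newInstance()
                        .newDocumentBuilder()
                        .parse(new InputSource(new StringReader(xml)))
                        .getDocumentElement();
                totals.tests += Integer.parseInt(suite.getAttribute("tests"));
                totals.failures += Integer.parseInt(suite.getAttribute("failures"));
                totals.errors += Integer.parseInt(suite.getAttribute("errors"));
            } catch (Exception e) {
                // Malformed XML or a missing attribute: refuse to report partial totals.
                throw new IllegalArgumentException("unreadable test report", e);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        // Two made-up reports standing in for results downloaded from two test clusters.
        List<String> reports = Arrays.asList(
                "<testsuite name=\"cluster-1\" tests=\"10\" failures=\"1\" errors=\"0\"/>",
                "<testsuite name=\"cluster-2\" tests=\"5\" failures=\"0\" errors=\"2\"/>");
        Totals totals = rollUp(reports);
        System.out.println(totals.tests + " tests, " + totals.failures + " failures, "
                + totals.errors + " errors, passed=" + totals.passed());
    }
}
```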
Extending Continuous Integration
> Automated build and test, both unit and integration tests
> Extend this to include Continuous Deployment
  • If CI passes, use Test Cloud Bursting to deploy, verify and validate system
[Diagram labels: SVN Repo.; Test Runner; Deploy system and run tests; Retrieve and process test results]
Next Steps
> Lots to think about
Summary
> Using the public cloud for scalability testing and verification is a good choice
> Without Elastic Grid we would have had a much more difficult experience (and may not have done it)
> Looking to expand metrics gathering & collection
> Looking to incorporate NetBeans as a client for cloud visualization
> We are looking to make this part of our permanent development environment
> How to spread the gospel of cloud computing for DoD
Dennis Reedy, Elastic Grid LLC
Ron Bowers, Army Research Laboratory