everything you wanted to know about velocity (but were afraid to … · 2010. 4. 12. · everything...
Everything you wanted to know about
Velocity
(but were afraid to cache)
Scott Colestock
Marcato Partners, LLC
What is it?
Velocity is a distributed in-memory key/value cache that provides .NET developers with a way to increase
performance and scalability when writing data-centric applications.
What is it? (2)
• The combined RAM available to all servers in a
Velocity cluster is presented to Velocity clients
as a unified whole
• Any serializable CLR object can be stored
– Actual location within cluster is transparent
– Client is a simple key/value API at heart
• Run as a service accessed across the network
• Additional servers can be added on demand
What we’ll cover
• What motivates this product/technology
• Terms / Pictures / Concepts
• Deploy / Install Process
• A lap around the API & Admin model
• Demos
• Gotchyas
Motivation
• Data-centric applications have been the norm for a long while
– Relational data
– More recently, “service-obtained” data
• Velocity is about increasing performance by bringing the data physically closer to the consumer
– Reduce pressure on underlying data stores/services
• Velocity can be about storing data in value-added form (logically closer to the consumer)
– Object graphs
– Output caching (not explicit in V1)
– Aggregated data in xml or other transformed formats
Motivation (2)
• Databases are always a point of high contention
as you scale out, and tuning is expensive
– Are your data retrieval sprocs getting harder to
maintain - excessive sql chops required?
• Service calls for reference data (internal/external)
are often slow or intentionally throttled
• Caching has always been considered a solution
for these issues…
Motivation (3)
• Machine-local caching solutions (like Microsoft’s “Enterprise Library Caching Application Block”) can provide a partial answer
– Easy key/value API
– Flexible store (memory, disk-backed, etc.)
– Flexible expiration and eviction policy
• Limitations:
– Limited by the memory available to a single node…
– Application recycles typically mean you lose the cache
– In a load-balanced environment, a large data set means you will frequently “miss” when attempting to load from cache…
Motivation (4)
[Diagram: a load balancer in front of three machines whose local caches hold keys 3, 5, 23; keys 7, 11, 47; and keys 12, 16, 33]
Machine-local caches wind up being sparsely populated when used with a load balancer (if the data set has many keys)
Motivation (5)
• Without a distributed cache, you have no central place to update/delete
• This means you can only cache data that can afford to be stale by some time period
– If the time period is short, you need a low TTL (time-to-live, aka expiration) which means more cache misses
• You can’t cache data that must have changes visible to the system in (near) real time
• With a distributed cache, you have one cache to shoot in the event of an update/delete
– Might be able to live with no expiration
What we’ll cover
• What motivates this product/technology
• Terms / Pictures / Concepts
• Deploy / Install Process
• A lap around the API & Admin model
• Demos
• Gotchyas
Windows Server AppFabric Caching
• History: AppFabric caching was a separate component
– Public debut at TechEd 2008 (earlier?)
– Codename: Velocity
• “Dublin” was a separate effort, focused on providing a hosting and management environment around WCF/WF
• November 2009: Technologies grouped under heading of “Windows Server AppFabric”
Relationship to Windows Azure AppFabric
• Service bus: Handle communication and authentication for accessing applications
– Expose apps through firewalls, NAT gateways, etc.
– Assist cloud-based apps talking to on-premise apps
– Other composite app scenarios; pub/sub
• Access Control Service: Allow you to avoid setting up federated identity agreements just to grant partner/customer access to your cloud-based or on-premise apps.
• Today: Only common marketing/branding with Windows Server AppFabric.
• Later: Common services for both
Cache-Aside Pattern
• In the current version, the out-of-box support
is for the “cache-aside” pattern.
– Check cache
– If miss, retrieve data, then populate the cache
• Lots of other patterns you might contemplate
(and simulate) with what is provided
– Read-through/Write-through
– Refresh-ahead/Write-behind
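The check / miss / populate steps above can be sketched in a few lines. This is a minimal illustration in Python, not the AppFabric client: `InMemoryCache`, `get_with_cache_aside`, and `load_product` are hypothetical names, and the "source" is a stand-in for whatever database or service call the application would really make.

```python
from typing import Callable, Dict, Optional


class InMemoryCache:
    """Hypothetical stand-in for a cache client; not the AppFabric API."""

    def __init__(self) -> None:
        self._items: Dict[str, object] = {}

    def get(self, key: str) -> Optional[object]:
        return self._items.get(key)  # None signals a miss in this sketch

    def put(self, key: str, value: object) -> None:
        self._items[key] = value


def get_with_cache_aside(cache: InMemoryCache, key: str,
                         load_from_source: Callable[[str], object]) -> object:
    value = cache.get(key)            # 1. Check cache
    if value is not None:
        return value
    value = load_from_source(key)     # 2. If miss, retrieve data...
    cache.put(key, value)             # ...then populate the cache
    return value


source_calls = []


def load_product(key: str) -> dict:
    """Pretend database or service call (hypothetical)."""
    source_calls.append(key)
    return {"id": key, "name": "Product " + key}


cache = InMemoryCache()
first = get_with_cache_aside(cache, "42", load_product)   # miss: goes to the source
second = get_with_cache_aside(cache, "42", load_product)  # hit: served from cache
```

Note that the calling code owns the pattern: the cache never talks to the source itself, which is what separates cache-aside from read-through.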
Cache-Aside Pattern
Logical Hierarchy
[Diagram: Cache Cluster of Server A / Cache Host A, Server B / Cache Host B, Server C / Cache Host C; labels: Named Cache: Product Catalog, Default Cache, Region: Sports, Region 1, Region 3]
• Client apps work with a single logical unit of cache
• Server process is DistributedCacheService.exe
• Caches explicitly created with TTL, expiration, HA policy
• Regions represent a partition of data (subset of key/value pairs). Live on one node. Unit of replication/failover.
• Regions can be implicit or explicit. Use explicit only for bulk gets or searching.
Logical Hierarchy
ID (Key)    Payload (Value)    Tags/VersionInfo
1           Foo                …
2           Bar                …
3           Baz                …
[Diagram labels: Named Cache: Product Catalog, Default Cache, Region: Sports, Region 1]
Physical Layout
[Diagram: a load balancer in front of Web Servers A, B, and C (IIS 7.x), which talk to a Cache Cluster of Cache Servers A, B, and C, each running a Cache Host]
• Cache servers designed to run in a domain
• Caches can have access control applied…
• Consider the nature of data stored in cache, and secure appropriately (don’t let cache be weakest link)
Combined Deployment
[Diagram: a load balancer in front of Web Servers A, B, and C, each running IIS 7.x alongside a Cache Host]
Physical Layout
[Diagram: a load balancer in front of Web Servers A, B, and C (IIS 7.x); a Cache Cluster of Cache Servers A, B, and C, each running a Cache Host; and a Config Store (File share or Sql Server)]
• Configuration store contains cache policies and global partition map (how keys divide into regions, which servers have which regions)
• If Sql config store, servers will send heartbeat to Sql. Otherwise, heartbeat goes to one or more “lead hosts”
• Partition map used by “Global Partition Manager” (one node in the cluster, but auto failover) to communicate routing information to Velocity clients
Regions as unit of replication/failover
(Global Partition Manager in action)
[Diagram: Cache Cluster of Server A / Cache Host A, Server B / Cache Host B, Server C / Cache Host C; labels: Named Cache: Product Catalog, Default Cache, Region: Sports, Region 1]
Regions as unit of replication/failover
(When using Secondaries)
[Diagram: Cache Cluster of Server A / Cache Host A, Server B / Cache Host B, Server C / Cache Host C; labels: Named Cache: Product Catalog, Default Cache, Region: Sports, Region 1, Sports secondary, Region 1 secondary]
(Updates done synchronously)
Local Cache
[Diagram: a load balancer in front of Web Servers A, B, and C (IIS 7.x), each holding a Local Cache, in front of a Cache Cluster of Cache Servers A, B, and C, each running a Cache Host]
• Local cache is an option that can be enabled when creating the cache client (DataCacheFactory)
• Allows a local cache to be populated that will prevent network hop (and serialization) if request can be satisfied locally
• Best when data set is (relatively) small, changes infrequently, and stale data is acceptable
• Can expire via TTL or notifications (which might be late/lost)
• Can specify max object count before evicting LRU
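To make the local cache trade-off concrete, here is a small Python model of an in-process cache sitting in front of a remote one, with TTL expiry and a max object count that evicts the least recently used entry. It is only a sketch under stated assumptions: `LocalCache`, `remote_get`, and the injectable `clock` are hypothetical names, and nothing here reflects how the real client implements its local cache.

```python
from collections import OrderedDict
from typing import Callable, List, Tuple


class LocalCache:
    """Illustrative in-process cache in front of a remote cache (hypothetical)."""

    def __init__(self, remote_get: Callable[[str], object], ttl_seconds: float,
                 max_objects: int, clock: Callable[[], float]) -> None:
        self._remote_get = remote_get
        self._ttl = ttl_seconds
        self._max = max_objects
        self._clock = clock
        self._items: "OrderedDict[str, Tuple[float, object]]" = OrderedDict()
        self.remote_calls = 0

    def get(self, key: str) -> object:
        now = self._clock()
        entry = self._items.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            self._items.move_to_end(key)      # mark as most recently used
            return entry[1]                   # no network hop, possibly stale
        self.remote_calls += 1                # missing or expired: go remote
        value = self._remote_get(key)
        self._items[key] = (now, value)
        self._items.move_to_end(key)
        while len(self._items) > self._max:
            self._items.popitem(last=False)   # evict least recently used
        return value

    def cached_keys(self) -> List[str]:
        return list(self._items)


now = [0.0]                                   # fake clock so the run is repeatable
remote = {"a": 1, "b": 2, "c": 3}
local = LocalCache(remote.__getitem__, ttl_seconds=30, max_objects=2,
                   clock=lambda: now[0])
local.get("a")                                # remote call, now cached locally
remote["a"] = 100                             # the cluster copy changes...
stale = local.get("a")                        # ...but the local copy is served
now[0] = 31.0                                 # TTL passes
fresh = local.get("a")                        # expired, so fetched again
local.get("b")
local.get("c")                                # third object pushes out "a"
```

The `stale` read is the cost the slide warns about: until the TTL passes (or a notification arrives), a web server keeps answering from its own copy.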
Data Types and Caching Considerations
• Reference Data: Product catalogs, “lookup” tables, other slow-moving content
– Safe to cache for a defined period of time because you probably live with staleness already
– “Local” cache option might be desirable for small data sets
• Activity Data: Shopping carts or other transient transaction state
– Accessed for read and write operations, but not shared. Low/No concurrency considerations – exclusive write.
– Safe to cache for reads and keep in cache for writes
• Resource Data: Inventory, Orders, and other core transactional data
– Accessed concurrently for read and write
– Caching will require a concurrency model to be chosen and managed
What we’ll cover
• What motivates this product/technology
• Terms / Pictures / Concepts
• Deploy / Install Process
• A lap around the API & Admin model
• Demos
• Gotchyas
Deploy/Install Considerations
• Windows “Application Server” Role required
• Hotfix required for Vista/Win2k8; not for Win7/Win2k8R2
• You’ll need PowerShell 2 (already in Win7/Win2k8R2)
• .NET 3.5 SP1 for cache clients; .NET 4 for servers
• Windows XP cannot be a client…
• “Install” and “Configure” for AppFabric are two distinct steps (much like BizTalk)
Deploy/Install Considerations
• Primary screen of
interest is choosing your
configuration store:
– XML/File share
– Sql-Based
• File share avoids the
need for Sql Server, but
requires that some
nodes in the cache
cluster be special (“Lead
Hosts”)
• Using Sql as the
configuration store is
the better engineering
choice for production –
you may have other
reasons to avoid it.
Deploy/Install Considerations
• As you build out your Velocity Cache Cluster,
you will do “New Cluster” on the first node,
and “Join Cluster” on subsequent nodes
• Ultimately, all of Windows Server AppFabric is
a set of features underneath the Application
Server Role – so standard command line
installations work.
– Setup.exe /i CacheAdmin,CacheService,CacheClient
AppFabric as Application Server
“Role Service”
Deploy/Install Considerations
• Can do a “Cache client” install for clients, or
for internal apps, just incorporate client
assemblies in your own build/deploy process
– Microsoft.ApplicationServer.Caching.Core.dll
– Microsoft.ApplicationServer.Caching.Client.dll
– Microsoft.WindowsFabric.Common.dll
– Microsoft.WindowsFabric.Data.Common.dll
What we’ll cover
• What motivates this product/technology
• Terms / Pictures / Concepts
• Deploy / Install Process
• A lap around the API & Admin model
• Demos
• Gotchyas
Caching Classes
DataCacheFactory
– DataCacheFactory()
– DataCacheFactory(configuration)
– DataCache GetCache(string cache)
– GetDefaultCache()
DataCacheFactoryConfiguration (Can set these via configuration)
– LocalCacheProperties
– NotificationProperties
– SecurityProperties
– DataCacheServerEndpoint[] Servers
DataCache
– Add: Adds a new object to the cache. Exception if the item is already in the cache.
– Put: Adds a new object to the cache. Replaces if already in cache.
– Get: Returns an object from the cache.
– Remove: Removes an object from the cache.
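The difference between Add and Put is easy to miss, so here is a tiny Python model of the four operations as described above. `SimpleCache` and `ItemAlreadyExists` are hypothetical names for this sketch; returning `None` on a missing key and a boolean from `remove` are assumptions of the model, not statements about the real DataCache class.

```python
from typing import Dict, Optional


class ItemAlreadyExists(Exception):
    """Hypothetical error type for this sketch."""


class SimpleCache:
    """Models the Add / Put / Get / Remove semantics; not the AppFabric API."""

    def __init__(self) -> None:
        self._items: Dict[str, object] = {}

    def add(self, key: str, value: object) -> None:
        if key in self._items:
            raise ItemAlreadyExists(key)   # Add refuses to overwrite
        self._items[key] = value

    def put(self, key: str, value: object) -> None:
        self._items[key] = value           # Put adds or replaces

    def get(self, key: str) -> Optional[object]:
        return self._items.get(key)        # None when absent (model assumption)

    def remove(self, key: str) -> bool:
        if key in self._items:
            del self._items[key]
            return True
        return False


cache = SimpleCache()
cache.add("sku:1", "Foo")
try:
    cache.add("sku:1", "Bar")              # same key again
    add_rejected = False
except ItemAlreadyExists:
    add_rejected = True
cache.put("sku:1", "Baz")                  # replaces without complaint
after_put = cache.get("sku:1")
removed = cache.remove("sku:1")
after_remove = cache.get("sku:1")
```

Add suits "first writer wins" situations; Put is the usual choice for the populate step of cache-aside, where overwriting is harmless.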
Caching Classes
DataCache with DataCacheItemVersion
• GetCacheItem: returns tags and version info
• GetIfNewer: lets you use that version info!
• Put and Remove have overloads that take
version info
– Allows for an optimistic concurrency model
– Will only succeed if version information matches
what is current for the cached item
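The optimistic model described above can be sketched as a version check on write. The Python below is an illustration only: `VersionedCache`, `VersionMismatch`, `get_item`, and `get_if_newer` are hypothetical names that mirror the ideas on the slide, and integer versions are an assumption (the real DataCacheItemVersion is an opaque object).

```python
from typing import Dict, Optional, Tuple


class VersionMismatch(Exception):
    """Hypothetical error: the item changed since the caller read it."""


class VersionedCache:
    """Models version-checked writes; not the AppFabric API."""

    def __init__(self) -> None:
        self._items: Dict[str, Tuple[int, object]] = {}
        self._next_version = 1

    def get_item(self, key: str) -> Optional[Tuple[int, object]]:
        return self._items.get(key)        # (version, value) or None

    def get_if_newer(self, key: str, known_version: int) -> Optional[Tuple[int, object]]:
        item = self._items.get(key)
        if item is None or item[0] == known_version:
            return None                    # nothing newer than what the caller has
        return item

    def put(self, key: str, value: object,
            expected_version: Optional[int] = None) -> int:
        current = self._items.get(key)
        if expected_version is not None:
            if current is None or current[0] != expected_version:
                raise VersionMismatch(key)  # someone else wrote first
        version = self._next_version
        self._next_version += 1
        self._items[key] = (version, value)
        return version


cache = VersionedCache()
cache.put("stock:42", 10)
version_a, qty_a = cache.get_item("stock:42")   # two readers see the same version
version_b, qty_b = cache.get_item("stock:42")
cache.put("stock:42", qty_a - 1, expected_version=version_a)   # first writer wins
try:
    cache.put("stock:42", qty_b - 1, expected_version=version_b)
    conflict = False
except VersionMismatch:
    conflict = True                        # second writer must re-read and retry
current_version, current_qty = cache.get_item("stock:42")
```

Nothing is locked here; the losing writer finds out at write time and decides whether to re-read and try again.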
DataCache and Locking
• GetAndLock: Allows you to lock a cache item
for a specified time period, even if not present
– (Will fail if already locked)
– public Object GetAndLock (string key, TimeSpan timeout, out DataCacheLockHandle lockHandle, bool forceLock)
• PutAndUnlock: Unlock an item, with given key
and lock handle
• Unlock: Explicitly unlock, optionally extend TTL
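For comparison with the version-based approach, the pessimistic flow can be modeled as a lock handle with a timeout. This Python sketch uses hypothetical names (`LockingCache`, `ItemLocked`, `InvalidLockHandle`) and an injectable clock; it always allows locking a key that is not yet present, which is a simplification of the `forceLock` behavior in the signature above.

```python
import itertools
from typing import Callable, Dict, Optional, Tuple


class ItemLocked(Exception):
    """Hypothetical error: another caller holds the lock."""


class InvalidLockHandle(Exception):
    """Hypothetical error: the handle does not match an active lock."""


class LockingCache:
    """Models lock / put-and-unlock / unlock; not the AppFabric API."""

    def __init__(self, clock: Callable[[], float]) -> None:
        self._clock = clock
        self._values: Dict[str, object] = {}
        self._locks: Dict[str, Tuple[int, float]] = {}   # key -> (handle, expiry)
        self._handles = itertools.count(1)

    def _active_lock(self, key: str) -> Optional[Tuple[int, float]]:
        lock = self._locks.get(key)
        if lock is not None and lock[1] <= self._clock():
            del self._locks[key]           # lock timed out
            return None
        return lock

    def _check(self, key: str, handle: int) -> None:
        lock = self._active_lock(key)
        if lock is None or lock[0] != handle:
            raise InvalidLockHandle(key)

    def get_and_lock(self, key: str, timeout_seconds: float) -> Tuple[Optional[object], int]:
        if self._active_lock(key) is not None:
            raise ItemLocked(key)          # will fail if already locked
        handle = next(self._handles)
        self._locks[key] = (handle, self._clock() + timeout_seconds)
        return self._values.get(key), handle

    def put_and_unlock(self, key: str, value: object, handle: int) -> None:
        self._check(key, handle)
        self._values[key] = value
        del self._locks[key]

    def unlock(self, key: str, handle: int) -> None:
        self._check(key, handle)
        del self._locks[key]


now = [0.0]
cache = LockingCache(clock=lambda: now[0])
value, handle = cache.get_and_lock("cart:7", timeout_seconds=5)   # key not present yet
try:
    cache.get_and_lock("cart:7", timeout_seconds=5)
    blocked = False
except ItemLocked:
    blocked = True
cache.put_and_unlock("cart:7", ["book"], handle)
value2, handle2 = cache.get_and_lock("cart:7", timeout_seconds=5)
now[0] = 6.0                               # holder never unlocks; lock times out
value3, handle3 = cache.get_and_lock("cart:7", timeout_seconds=5)
```

The timeout is what keeps a crashed client from holding an item forever, at the price of a second caller possibly taking over while the first is merely slow.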
DataCache and Tags/Regions
• Explicitly created regions live on a single
node…can create a hot spot for both call
volume and memory growth
• But they offer bulk retrieval and flexible tag-based retrieves
• Instead of regions: can simulate secondary
indexes with your own secondary-to-primary
mapping
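One way to read "secondary-to-primary mapping" is to keep an extra cache entry per secondary value that lists the primary keys belonging to it. The Python below is a sketch of that idea only: a plain `dict` stands in for the cache, and the key prefixes (`product:`, `category:`) and function names are hypothetical.

```python
from typing import Dict, List


def put_product(cache: Dict[str, object], product: dict) -> None:
    primary_key = "product:" + product["id"]
    cache[primary_key] = product                      # the item itself
    index_key = "category:" + product["category"]     # the secondary "index" entry
    keys = list(cache.get(index_key, []))
    if primary_key not in keys:
        keys.append(primary_key)
    cache[index_key] = keys                           # secondary -> primary keys


def get_by_category(cache: Dict[str, object], category: str) -> List[dict]:
    keys = cache.get("category:" + category, [])
    # Index entries can outlive the items they point to (eviction, expiry),
    # so skip any primary key that is no longer in the cache.
    return [cache[k] for k in keys if k in cache]


cache: Dict[str, object] = {}
put_product(cache, {"id": "1", "category": "sports", "name": "Foo"})
put_product(cache, {"id": "2", "category": "sports", "name": "Bar"})
put_product(cache, {"id": "3", "category": "garden", "name": "Baz"})
del cache["product:2"]                                # simulate an eviction
sports = get_by_category(cache, "sports")
```

Because keys hash across the cluster, this avoids the single-node hot spot of an explicit region. The index update is a read-modify-write, though, so in a shared cache it would need version checks or locking.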
Administrative Model
• Administration for AppFabric Caching done purely through PowerShell
• Can administrate entire Cache Cluster from wherever administrative portion of install has been done – all nodes addressable from single command line location
• Use-CacheCluster points the shell at a particular cluster to administrate
• Remember: Get-CacheHelp ☺
What we’ll cover
• What motivates this product/technology
• Terms / Pictures / Concepts
• Deploy / Install Process
• A lap around the API & Admin model
• Demos
• Gotchyas
Gotchyas
• Not a gotchya: AppFabric provides a SessionStoreProvider class that plugs into the ASP.NET session storage provider model
• Balance number of nodes in cluster with memory per node.
– Too many nodes = cluster overhead, too much memory per node = GC overhead
• If you don’t use Sql Config Store, you need to manually run Start-CacheHost after reboot
• Sql Config Store requires high Sql privileges right now at point of install
• Currently service runs as network service account
• Consider what you will do when cache is down
– You can go after source of truth
– How do you avoid leaving stale data in the cache?
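One possible shape for the "cache is down" plan is sketched below in Python. It is an assumption-laden illustration, not a recommendation from the talk: `CacheUnavailable`, `get_or_load`, and `update_and_invalidate` are hypothetical names, and recording missed invalidations for later retry is just one of several ways to approach the stale-data question.

```python
from typing import Callable, Dict, List, Optional


class CacheUnavailable(Exception):
    """Hypothetical error raised by a cache client when the cluster is down."""


def get_or_load(cache_get: Callable[[str], Optional[object]],
                cache_put: Callable[[str, object], None],
                load_from_source: Callable[[str], object],
                key: str) -> object:
    try:
        value = cache_get(key)
        if value is not None:
            return value
    except CacheUnavailable:
        # Cache is down: go after the source of truth and skip the populate step.
        return load_from_source(key)
    value = load_from_source(key)
    try:
        cache_put(key, value)
    except CacheUnavailable:
        pass  # the read still succeeds; the next caller simply misses again
    return value


def update_and_invalidate(save_to_source: Callable[[str, object], None],
                          cache_remove: Callable[[str], object],
                          key: str, value: object, missed: List[str]) -> None:
    save_to_source(key, value)
    try:
        cache_remove(key)
    except CacheUnavailable:
        missed.append(key)  # remember the invalidation so it can be retried later


def down(*_args: object) -> None:
    """Simulates every cache call failing."""
    raise CacheUnavailable()


source: Dict[str, object] = {"42": "widget"}
missed_invalidations: List[str] = []
value = get_or_load(down, down, source.__getitem__, "42")
update_and_invalidate(source.__setitem__, down, "42", "gadget", missed_invalidations)
```

Falling back to the source keeps reads working, but it also sends the full load to the store the cache was protecting, so throttling may be needed. The missed-invalidation list only helps if something replays it once the cache returns.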
Thank you -
Questions?