Publication and Protection of Site Sensitive Information in Grids. Shreyas Cholia, NERSC Division, Lawrence Berkeley Lab. Open Source Grid and Cluster Conference, May 15th, 2008

  • Publication and Protection of Site Sensitive Information in Grids

    Shreyas Cholia NERSC Division, Lawrence Berkeley Lab

    Open Source Grid and Cluster Conference, May 15th, 2008

  • Information Collection in Grids
    - To create a successful and functional grid, you need to collect information from sites
    - Grid infrastructure must publish collected information and make it available to interested parties
    - We want to analyze the vectors of information collection
      - Systems publishing/collecting information
      - Type of information being gathered
      - Methods of data protection applied to this information

  • Focus on Open Science Grid
    - What is the Open Science Grid?
      - Virtual facility providing distributed compute and storage resources
      - Comprised of VOs and their users + resource-providing sites + OSG infrastructure providers
      - Broad range of sites, from small universities to large national labs
      - Must have flexible infrastructure to meet diverse site/VO requirements
    - Information collection and publishing coordinated by Grid Operations Center (GOC)
    - NERSC/LBL heavily involved in OSG
    - Our study started out as a recommendation report for OSG, but many of the results are applicable to other grids

  • Information Being Published
    - Resource selection information
    - Monitoring
    - Accounting
    - Troubleshooting
    - Log files
    - Site availability information
    - Site validation

  • Information Collection Systems in OSG
    - GIP/CEMon
    - Gratia
    - Syslog-NG
    - RSV
    - site_verify
    - MonALISA
    - Others?

  • CEMon
    - Periodically queries Compute Element (CE) state
    - Publishes CE information as GIP attributes
    - Information made public through BDII and ReSS Condor ClassAds
    - Used for resource selection queries

  • CEMon Sensitive Info
    - Operating system version info
    - Underlying jobmanager
    - Internal system paths
    - Authentication method

    All this is necessary for a successful grid query, BUT:
    - Site must understand that this info is public
    - May want to restrict level of detail to avoid a Google hack

  • Gratia
    - OSG accounting system
    - Sites install local probes that report job/storage usage records to a collector
    - Information published through web interface
    - Web interface supports custom SQL queries

  • Gratia - Sensitive Info
    - User DN and local account names
    - Job information

    Risks:
    - Users may consider job information private.
    - If a DN (or password) is compromised, it becomes very easy to discover other sites supporting the same DN.
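
    To make the second risk concrete, here is a minimal sketch of how openly queryable accounting records can be correlated by DN. The record layout, site names, and DNs are invented for illustration and are not Gratia's actual schema.

```python
from collections import defaultdict

# Hypothetical accounting records, shaped like rows a public
# accounting query might return. All values are invented.
RECORDS = [
    {"site": "SiteA", "dn": "/DC=org/DC=example/CN=Alice", "local_user": "alice01"},
    {"site": "SiteB", "dn": "/DC=org/DC=example/CN=Alice", "local_user": "osguser7"},
    {"site": "SiteB", "dn": "/DC=org/DC=example/CN=Bob", "local_user": "bob"},
    {"site": "SiteC", "dn": "/DC=org/DC=example/CN=Alice", "local_user": "al"},
]


def sites_by_dn(records):
    """Map each user DN to the sorted list of sites where it appears."""
    index = defaultdict(set)
    for rec in records:
        index[rec["dn"]].add(rec["site"])
    return {dn: sorted(sites) for dn, sites in index.items()}


exposure = sites_by_dn(RECORDS)
print(exposure["/DC=org/DC=example/CN=Alice"])  # ['SiteA', 'SiteB', 'SiteC']
```

    A single grouping pass is enough to list every site that accepts a given DN, which is why unauthenticated access to per-user records matters.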

  • Syslog-NG
    - Collects grid log files at a central collector
    - Centralized log collection
      - Troubleshooting distributed grid workflows
      - Security incident response, e.g. where was a compromised DN used?
    - Queryable database backend
    - Tiered architecture

  • Syslog-NG Risks
    - Log files are sensitive! Most sites want to limit access to these.
      - Internal system info - may expose vulnerabilities
      - Detailed user, software info and failure modes
    - May not want to make these available to grid infrastructure providers
      - No longer under site control
    - Tiered architecture design allows sites to set up local collectors that can filter and forward limited information to the grid
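
    The filter-and-forward idea can be sketched as follows. This is a Python stand-in for the concept, not syslog-ng's own filter syntax, and the log line format, host names, and event names are assumptions made up for the example.

```python
import re

# Hypothetical site policy: forward only job start/end events, and keep
# only the timestamp, event type, and user DN from each matching line.
FORWARD_PATTERN = re.compile(
    r'^(?P<ts>\S+) \S+ gatekeeper: (?P<event>JOB_START|JOB_END) dn="(?P<dn>[^"]+)"'
)

# Invented local log lines; real gatekeeper logs will look different.
LOCAL_LOG = [
    '2008-05-15T10:00:00 node17 gatekeeper: JOB_START dn="/DC=org/DC=example/CN=Alice" jobmanager=pbs path=/usr/local/internal',
    '2008-05-15T10:00:05 node17 gatekeeper: ERROR auth failure from 10.0.0.5, stack trace follows',
    '2008-05-15T10:30:00 node17 gatekeeper: JOB_END dn="/DC=org/DC=example/CN=Alice" exit=0',
]


def filter_for_forwarding(lines):
    """Return reduced lines that are safe to forward off site.

    Non-matching lines (errors, stack traces) are dropped, and matching
    lines lose their host name and trailing internal details.
    """
    forwarded = []
    for line in lines:
        match = FORWARD_PATTERN.match(line)
        if match:
            forwarded.append(
                f'{match["ts"]} {match["event"]} dn="{match["dn"]}"'
            )
    return forwarded


for line in filter_for_forwarding(LOCAL_LOG):
    print(line)
```

    The allow-list approach (forward only what matches) fails closed: a new, unanticipated log message stays on site until someone decides it is safe to publish.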

  • MonALISA
    - Publishes resource availability
      - Load information
      - Performance information
    - Public web interface. Very useful for querying the state of the grid at a high level.

    BUT:
    - May be used to target overloaded sites for DoS

  • RSV and site_verify
    - Probes run by grid infrastructure to verify site capabilities and report on site availability
    - Verifies information published by CEMon
    - Publishes results online (VORS)

    Risks:
    - Same risks as CEMon info
    - Additionally, historical data is available - may be able to trace downtimes (when system is in a transitional state).

  • Summary of Sensitive Info
    - Account names, user DNs (VORS, Gratia)
    - Failure modes, security-related details (Syslog-NG)
    - Historical system availability (VORS)
    - System load (MonALISA)
    - Application names, internal paths (Gratia, CEMon)
    - Software levels (CEMon)

  • Security Risks
    - Gratia - public interface to Gratia DB
      - Track user activity on a site
      - Rival project can discover job information
      - In case of compromised cert/account, query DB for other sites with same account
    - Syslog-NG
      - Internal failure modes, other logging details available to non-site personnel
      - Security incident details no longer private
    - CEMon/VORS
      - List of valid user accounts, DNs made public
      - Software levels, authentication method public - possible Google hack
      - Historical archive of system info (may be able to target recurring downtimes)
    - MonALISA
      - System load info - DoS attack during high load

  • Data Protection (for Sites)
    - Turn down logging in Syslog-NG to minimal level
      - Start-stop times, user DN info
      - Increase level for troubleshooting
    - Customize probes to meet site requirements
      - Only publish necessary information
      - Be AWARE of what is going out!!
    - Modify GIP attributes
      - Override or modify attributes as necessary
    - Mask sensitive data
      - Use generic VO names instead of local account names
    - Site level collectors
      - Review and filter before forwarding to OSG
    - Choose secure/encrypted publication channels
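
    As an illustration of masking sensitive data before publication, the sketch below swaps local account names for generic VO names and drops attributes a site has chosen to withhold. The field names, the account-to-VO mapping, and the record itself are hypothetical and do not reflect any probe's real record format.

```python
# Hypothetical mapping from local account names to generic VO names.
ACCOUNT_TO_VO = {"alice01": "cms", "bob": "atlas"}

# Hypothetical attribute names this site has decided not to publish.
WITHHELD_FIELDS = {"internal_path", "os_patch_level"}


def mask_record(record, account_to_vo=ACCOUNT_TO_VO, withheld=WITHHELD_FIELDS):
    """Return a copy of `record` that is safe to publish.

    Withheld fields are removed, and the local account name is replaced
    by its VO name (or a placeholder when the account is unmapped).
    """
    masked = {key: value for key, value in record.items() if key not in withheld}
    if "local_user" in masked:
        masked["local_user"] = account_to_vo.get(masked["local_user"], "unknown-vo")
    return masked


record = {
    "site": "SiteA",
    "local_user": "alice01",
    "internal_path": "/usr/local/grid/tmp",
    "os_patch_level": "2.6.9-42",
    "cpu_hours": 12.5,
}
print(mask_record(record))
# {'site': 'SiteA', 'local_user': 'cms', 'cpu_hours': 12.5}
```

    Applying a step like this at a site-level collector keeps the decision about what leaves the site under the site's own control.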

  • Suggestions for OSG
    - Authenticated access to information services
      - Use GSI certs within browser to authenticate user
      - Limit access based on VO
    - Consolidate services where possible
      - Minimize information streams publishing the same data
      - TeraGrid INCA as a model?
    - Use encrypted SSL-based communication for ALL information streams
      - https, GSI etc.
    - Use robots.txt to keep search engines from crawling and caching information pages
    - Authenticate probes using GSI host certs to prevent bogus information
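
    For the robots.txt suggestion, a small sketch using Python's standard `urllib.robotparser` shows how a site could check that its policy actually blocks crawlers. The robots.txt content, the URL, and the crawler name are placeholders chosen for the example.

```python
from urllib.robotparser import RobotFileParser

# Example policy: ask every crawler to stay out of the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def crawler_allowed(robots_txt, url, user_agent="ExampleBot"):
    """Return True if `user_agent` may fetch `url` under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Placeholder URL for a monitoring page.
print(crawler_allowed(ROBOTS_TXT, "https://monitor.example.org/site/status"))
```

    Note that robots.txt is advisory: well-behaved crawlers honor it, but it does not stop anyone from requesting the pages directly, so it complements authenticated access rather than replacing it.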

  • Conclusions
    - Not a replacement for hard security policies
      - Must fix and patch software regularly
      - Internally monitor systems
    - Sites should have more flexibility and control over published information
    - OSG should consider replacing public access with user/VO-based access
