process control daemon
![Page 1: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/1.jpg)
Process Control Daemon for Embedded Linux Platforms
Speaker: Hai Shalom
rt-embedded.com/pcd
![Page 2: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/2.jpg)
Background review: What were the reasons that led to the development of PCD.
PCD project review: Features and high level overview of the project.
Live demonstration.
Q & A.
Agenda
![Page 3: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/3.jpg)
Does your product have a process controller?
Does your product automatically recover after a crash?
Do you think your product’s boot time is fast enough?
Are you using methods other than printf to debug a crashed application?
Are you familiar with all the processes which are running in your product and their dependencies?
Some questions
![Page 4: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/4.jpg)
Most of you probably answered “No” to at least one question.
People who answered “Yes” to all questions are probably using PCD already!
Let’s review some facts about Embedded Linux based products…
What were your answers?
![Page 5: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/5.jpg)
Done by scripts (rcS, rc.*). These are great, but might be:
Not optimal for embedded / not deterministic:
◦ Limited ways to synchronize dependent processes (delay).
◦ Limited ways to verify the successful start of a process.
◦ No error checking (usually).
◦ No formal way to define dependencies.
◦ Difficult to start processes in parallel.
Not trivial to understand, maintain and extend:
◦ Require additional shell scripting expertise.
◦ Tend to be long and unreadable.
◦ Plenty of commented code, old remarks, different code styles.
System start up
![Page 6: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/6.jpg)
Looks familiar?
![Page 7: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/7.jpg)
A crashed program just terminates, usually after printing “Segmentation Fault”.
◦ Now what? Where is the debug information?
Kernel crashes are assumed to be handled by the system’s watchdog.
Signal handlers are not always implemented correctly.
◦ It is unsafe to call printf, and many other functions that are not async-signal-safe, from a signal handler.
The system remains unstable and unusable.
End user must power-cycle (again?).
Crash handling and recovery
![Page 8: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/8.jpg)
![Page 9: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/9.jpg)
No central management entity.
init is the parent of all processes.
Must know a process’s pid in order to signal or kill it.
Each process must manage its own children.
A child process inherits its parent’s priority.
Parents must retrieve their child’s exit status, or else we end up with zombies…
Process management
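The zombie issue mentioned above can be illustrated with a minimal Python sketch. It is not PCD code, and the helper name `run_and_reap` is hypothetical: a parent forks a child and has to call `waitpid` to collect the exit status, otherwise the terminated child lingers in the process table.

```python
import os


def run_and_reap(exit_code: int) -> int:
    """Fork a child that exits with exit_code, then reap it in the parent."""
    pid = os.fork()
    if pid == 0:
        # Child: exit immediately, skipping the parent's cleanup handlers.
        os._exit(exit_code)
    # Parent: until waitpid() returns, the terminated child remains a zombie.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    # Otherwise the child was killed by a signal; report it as a negative number.
    return -os.WTERMSIG(status)


if __name__ == "__main__":
    print(run_and_reap(3))
```

A central process manager takes over this bookkeeping, so individual applications do not each have to get it right.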
![Page 10: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/10.jpg)
A customer reports a crash in the field or in his lab tests:
◦ There is no standard method for generating and collecting remote debug information.
◦ When a process terminates abnormally, all its information goes away and no log is saved.
◦ You might be on the next flight to the customer’s lab.
Field/Remote debugging
![Page 11: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/11.jpg)
What is PCD?
A great (and free) solution: PCD
![Page 12: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/12.jpg)
PCD – Process Control Daemon, is an open source, light-weight system level process manager for Embedded-Linux based products (consumer electronics, network devices, etc).
The PCD provides a complementary service for any Embedded Linux driven product.
Designed and implemented by Hai Shalom during employment at Texas Instruments for Next-Gen Puma5 Cable chipset.
Released to open source as part of his M.Sc. degree research.
PCD is a proven solution that already drives millions of devices in the world.
What is PCD?
![Page 13: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/13.jpg)
System startup: PCD starts up the system in an efficient, synchronized and deterministic manner.
Process management: A centralized entity that controls and monitors all processes, and provides an API to manage them.
System recovery: A configurable, per-process recovery action is taken in case of a crash.
Debug information: PCD provides a detailed crash log in case of a program error.
PCD Features in high-level
![Page 14: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/14.jpg)
What are the advantages of products with PCD?
How does it work?
![Page 15: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/15.jpg)
Rule blocks replace/extend traditional shell scripts.
Each rule defines a single process.
Rule inter-dependency is well defined.
PCD Scripts: Rule blocks
[Diagram: a PCD script file contains Rule 1, Rule 2 and Rule 3; each rule defines one process (Process 1, Process 2, Process 3).]
![Page 16: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/16.jpg)
Very simple and readable syntax.
Easy to extend and maintain.
Each Rule block is based on the same template and contains the following details:
◦ What is the process name and parameters?
◦ When to start it (depends on event…)?
◦ What is the required priority?
◦ What is the completion event?
◦ How much time to wait for it to complete?
◦ What to do in case of a crash?
PCD Scripts: Rule blocks
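To make the template concrete, the questions a rule block answers can be modeled as a small record. This is an illustrative Python sketch only: the field names, the enum values and the example rule are assumptions made for this example, and they are not the keywords of the real PCD script syntax.

```python
from dataclasses import dataclass
from enum import Enum


class FailureAction(Enum):
    """What to do in case of a crash (names are hypothetical)."""
    RESTART = "restart"
    REBOOT = "reboot"
    RECOVERY_RULE = "recovery_rule"
    IGNORE = "ignore"


@dataclass(frozen=True)
class Rule:
    name: str                      # rule identifier
    command: str                   # process name and parameters
    start_condition: str           # event that triggers the start
    priority: int                  # required scheduling priority
    end_condition: str             # completion event
    timeout_ms: int                # how long to wait for completion
    failure_action: FailureAction  # what to do in case of a crash


if __name__ == "__main__":
    # Hypothetical rule for a web server that starts once networking is up.
    web = Rule(
        name="WEB_SERVER",
        command="/usr/sbin/httpd -f",
        start_condition="rule NETWORK completed",
        priority=0,
        end_condition="process ready event",
        timeout_ms=5000,
        failure_action=FailureAction.RESTART,
    )
    print(web)
```

Because every rule carries the same fields, a script made of such blocks can be checked and extended mechanically.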
![Page 17: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/17.jpg)
![Page 18: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/18.jpg)
Once all rules are parsed, the PCD builds a dependency graph database.
PCD starts each rule at the “right” time.
PCD continuously monitors the system.
Event Driven System Startup
[Diagram: dependency graph of rules, started by the PCD and ending with a last rule.]
![Page 19: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/19.jpg)
The “right” time is when a Start event has occurred:
◦ Another rule or set of rules has completed successfully, or;
◦ A resource has been created (network device, file).
A Completion event occurs when the attached process:
◦ Has exited with the correct status, or;
◦ Sent a “Process ready” event to the PCD, or;
◦ Created a resource, or;
◦ Was running for a specified amount of time, or;
◦ Was created.
A Completion event of one rule could be the Start event of another rule.
Event Driven System Startup
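The chaining of completion events into start events can be sketched as a small simulation. This Python example is an assumption-laden illustration, not PCD's implementation: rule names are made up, only "another rule completed" start events are modeled, and every rule is assumed to complete immediately after it starts.

```python
from collections import deque


def start_order(depends_on: dict[str, set[str]]) -> list[str]:
    """Return the order in which rules start when each rule starts as soon
    as every rule it depends on has completed."""
    completed: set[str] = set()
    order: list[str] = []
    # Rules without a start condition are ready immediately.
    ready = deque(sorted(r for r, deps in depends_on.items() if not deps))
    waiting = {r: set(deps) for r, deps in depends_on.items() if deps}
    while ready:
        rule = ready.popleft()
        order.append(rule)   # "start" the rule
        completed.add(rule)  # in this simulation it completes at once
        # The completion event of this rule may be the start event of others.
        for other in sorted(waiting):
            if waiting[other] <= completed:
                del waiting[other]
                ready.append(other)
    if waiting:
        raise ValueError(f"rules never started: {sorted(waiting)}")
    return order


if __name__ == "__main__":
    rules = {
        "syslog": set(),
        "watchdog": set(),
        "network": {"syslog"},
        "webserver": {"network", "syslog"},
    }
    print(start_order(rules))
    # ['syslog', 'watchdog', 'network', 'webserver']
```

A rule whose start event can never occur (a cycle or a missing dependency) is reported instead of being silently skipped.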
![Page 20: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/20.jpg)
Dependencies between processes are well defined.
Rules are started as soon as their start event arrives.
No need for non-deterministic delays between starting processes.
Rules without inter-dependency are started in parallel.
Improves user experience and product reputation (fast product!).
Reduced startup time
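A rough way to see why parallel starts shorten boot time: with a strictly sequential script the startup time is the sum of all start durations, while with dependency-driven starts it is the length of the longest dependency chain. The Python sketch below works this out for made-up rule names and durations (in seconds). It assumes an acyclic graph and enough CPU to run independent rules truly in parallel.

```python
from functools import lru_cache


def startup_times(duration: dict[str, int],
                  depends_on: dict[str, set[str]]) -> tuple[int, int]:
    """Return (sequential, parallel) startup time for the given rules."""
    sequential = sum(duration.values())

    @lru_cache(maxsize=None)
    def finish(rule: str) -> int:
        # A rule starts when its slowest dependency has finished.
        start = max((finish(d) for d in depends_on.get(rule, ())), default=0)
        return start + duration[rule]

    parallel = max(finish(rule) for rule in duration)
    return sequential, parallel


if __name__ == "__main__":
    duration = {"syslog": 1, "network": 4, "storage": 3, "webserver": 2}
    depends_on = {
        "network": {"syslog"},
        "storage": {"syslog"},
        "webserver": {"network", "storage"},
    }
    print(startup_times(duration, depends_on))  # (10, 7)
```

Here network and storage overlap, so the longest chain (syslog, network, webserver) determines the total.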
![Page 21: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/21.jpg)
Enhanced stability and robustness
[Diagram: a process crash is signaled to the PCD, which applies the failure action of the process’s rule: Restart, Reboot, Recover or Ignore.]
![Page 22: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/22.jpg)
Enhanced monitoring of processes and recovery in case of failure.
Each Rule defines what to do in case its process crashes:
◦ Restart the process: Usually for non-critical services such as a web server, or processes that can recover by restarting themselves.
◦ Reboot the system: In case of a fatal, non-recoverable error.
◦ Initiate a recovery rule.
◦ Ignore: Same behavior as without PCD.
Enhanced stability and robustness
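The per-rule failure actions can be pictured as a small supervisor loop. The following Python sketch is not PCD's code: the enum, the `supervise` helper and the restart limit are all assumptions made for illustration, and the reboot and recovery actions are only logged rather than performed.

```python
import subprocess
import sys
from enum import Enum


class FailureAction(Enum):
    RESTART = "restart"
    REBOOT = "reboot"
    RECOVERY_RULE = "recovery_rule"
    IGNORE = "ignore"


def supervise(command: list[str], action: FailureAction,
              max_restarts: int = 2) -> list[str]:
    """Run command and apply the failure action whenever it exits abnormally.
    Returns a log of what happened."""
    log: list[str] = []
    restarts = 0
    while True:
        status = subprocess.run(command).returncode
        if status == 0:
            log.append("exited normally")
            return log
        log.append(f"failed with status {status}")
        if action is FailureAction.RESTART:
            if restarts == max_restarts:  # hypothetical safety limit
                log.append("restart limit reached")
                return log
            restarts += 1
            log.append("restart")
            continue
        if action is FailureAction.REBOOT:
            log.append("reboot requested")  # a real system would reboot here
        elif action is FailureAction.RECOVERY_RULE:
            log.append("recovery rule requested")
        else:
            log.append("ignored")
        return log


if __name__ == "__main__":
    failing = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print(supervise(failing, FailureAction.RESTART, max_restarts=1))
```

Keeping the action in the rule, rather than in each program, means recovery policy can change without touching application code.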
![Page 23: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/23.jpg)
Enhanced debugging capabilities
[Diagram: on a crash signal, the exception handler registered through the PCD API prepares and sends exception info to the PCD, which produces a detailed crash log and stores it in NVRAM.]
![Page 24: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/24.jpg)
The PCD exception handlers will catch and handle any fault exception (signals).
The PCD will provide useful debug information.
The information speeds up the error fixing cycle and improves product robustness.
Error logs are saved in non-volatile memory:
◦ Can be used for offline analysis after a validation cycle in the lab.
◦ Can be used for post-mortem analysis of units from the field.
Enhanced debugging capabilities
![Page 25: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/25.jpg)
Snapshot taken from an ARM platform.
Contains:
◦ Signal info
◦ Registers
◦ Map file
Registers pc and lr/ra can be used to trace the bug using addr2line or objdump.
Crash log with PCD
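As a sketch of the tracing step, the snippet below builds an `addr2line` invocation for the pc and lr values read from a crash log. The binary name and addresses are hypothetical. It assumes an unstripped binary built with debug information, and on a cross-compiled target the toolchain's own prefixed `addr2line` would normally be used instead of the host one. Addresses inside shared libraries first need the load base from the map file subtracted.

```python
import shutil
import subprocess
from typing import Optional


def addr2line_command(binary: str, addresses: list[int]) -> list[str]:
    """Build an addr2line command line for the given code addresses."""
    # -e selects the executable, -f prints function names, -C demangles C++.
    return ["addr2line", "-e", binary, "-f", "-C", *[hex(a) for a in addresses]]


def resolve(binary: str, addresses: list[int]) -> Optional[str]:
    """Run addr2line if it is installed; return its output, else None."""
    if shutil.which("addr2line") is None:
        return None
    result = subprocess.run(addr2line_command(binary, addresses),
                            capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    # Hypothetical pc and lr values copied from a crash log.
    print(" ".join(addr2line_command("./myapp", [0x8484, 0x8470])))
```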
![Page 26: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/26.jpg)
Process management
[Diagram: processes send requests to the PCD, which acts on them through the rules. Labels: Rules 1 to 4 and Processes 1 to 4. “New Configuration”: a request to restart Process 2, after which the PCD restarts Process 2. “User input: Disable something”: a request to terminate Process 4, after which the PCD terminates Process 4.]
![Page 27: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/27.jpg)
Process management with the PCD API:
◦ Start or terminate a process.
◦ Send a “process ready” event.
◦ Signal a process.
◦ Register to exception handlers.
◦ Reboot the system (with a logged reason).
The PCD API is available by linking with the PCD library.
Process management
![Page 28: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/28.jpg)
In what order are the processes started?
What does each process depend on?
PCD can generate dependency graphs for visual representation of all the rules and their dependencies.
This visibility provides an excellent means to examine and understand the dependencies between the rules in the system, and to fix them in case of mistakes.
Dependency graph generation
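One common way to produce such a picture is to emit Graphviz DOT text and render it with the `dot` tool. The Python sketch below shows the idea for a made-up set of rules. It is an assumption that this resembles PCD's own graph output, whose exact format is not described here.

```python
def to_dot(depends_on: dict[str, set[str]]) -> str:
    """Render rule dependencies as a Graphviz DOT digraph."""
    lines = ["digraph pcd_rules {"]
    for rule in sorted(depends_on):
        lines.append(f'    "{rule}";')
        for dep in sorted(depends_on[rule]):
            # Arrows point from a dependency to the rule that waits for it.
            lines.append(f'    "{dep}" -> "{rule}";')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    rules = {"syslog": set(), "network": {"syslog"}, "webserver": {"network"}}
    print(to_dot(rules))
```

Saving the output to a file such as `rules.dot` and running `dot -Tpng rules.dot -o rules.png` would produce an image, assuming Graphviz is installed.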
![Page 29: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/29.jpg)
PCD is architecture agnostic, except for the crash log code that displays register details.
To date, the following platforms are supported:
◦ ARM (primary development target).
◦ MIPS (secondary development target).
◦ x86
◦ x64
For other platforms, the crash log will not include register details.
The last two architectures allow running a PCD driven system on any development PC running Linux.
Supported architectures
![Page 30: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/30.jpg)
PCD is a light-weight process controller for embedded platforms.
Here are its modest memory requirements:
◦ PCD Code: 28KB
◦ PCD Data section: 4KB
◦ PCD Heap: 36KB (typical).
◦ PCD Stack (watermark): 84KB (typical).
Memory Requirements
![Page 31: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/31.jpg)
The PCD Project is an Open-Source project.
The PCD project is licensed under the GNU Lesser General Public License version 2.1, as published by the Free Software Foundation.
Its license allows linking proprietary software without any license contamination.
To view a copy of this license, visit http://www.gnu.org/licenses/lgpl-2.1.html#SEC1
Licensing
![Page 32: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/32.jpg)
PCD improved the Puma5 products in various aspects:
◦ Startup time: The system boots much more quickly compared to scripts (15 seconds faster).
◦ Robustness, availability: Due to the recovery actions, the system is more available and user experience is better.
◦ Quality: Detailed crash logs pointed out bugs, reduced fix time, enabled remote and offline analysis.
Customers found it very useful:
◦ Added new rule blocks with their own modifications.
PCD contribution to product success
![Page 33: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/33.jpg)
PCD Home page (Hai’s Real-Time Embedded blog): http://www.rt-embedded.com/pcd
Project management and source code at SourceForge: http://sourceforge.net/projects/pcd/
PCD Documentation and user guides (Yes! There is some): http://www.rt-embedded.com/blog/pcd-process-control-daemon/pcd-documentation/
PCD support forum: http://sourceforge.net/projects/pcd/support
New software engineers are welcome to join the project and contribute.
PCD Resources
![Page 34: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/34.jpg)
Questions and Answers
Hai, fixing a bug
![Page 35: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/35.jpg)
Wrap Up
![Page 36: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/36.jpg)
System startup: PCD starts up the system in an efficient, synchronized and deterministic manner.
Process management: A centralized entity that controls and monitors all processes, and provides an API to manage them.
System recovery: A configurable, per-process recovery action is taken in case of a crash.
Debug information: PCD provides a detailed crash log in case of a program error.
PCD can make your product a better product!
PCD Features in high-level
![Page 37: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/37.jpg)
Thank you!
Hai Shalom
![Page 38: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/38.jpg)
Backup slides
![Page 39: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/39.jpg)
PCD High level technical info
![Page 40: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/40.jpg)
The PCD API provides an easy interface to request various services from the PCD:
◦ Start or terminate a process.
◦ Send a “process ready” event.
◦ Signal a process.
◦ Register to PCD default exception handlers.
◦ Reboot the system (with a logged reason).
The PCD API is available by linking with the PCD library.
Standard API for PCD services
![Page 41: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/41.jpg)
Every program can register to PCD’s exception handlers.
The PCD acts as a “crash daemon” which listens on a dedicated socket.
The exception handler collects debug information and sends it to the PCD using only “Safe functions”.
The PCD formats the data, displays it on the console and logs it in the non-volatile storage.
PCD Exception handler
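The message flow between a crashing program and the crash daemon can be sketched with a Unix datagram socket pair. Everything specific here is an assumption for illustration: the fixed-size record layout, the function names and the register values are made up, and PCD's real wire format is not shown in these slides. Python signal handlers also do not run in a true async-signal context, so this only demonstrates the data path, not a safe handler.

```python
import os
import signal
import socket
import struct

# Hypothetical fixed-size record: pid, signal number, pc, lr.
# A fixed layout avoids formatting or allocating memory on the crashing side.
REPORT = struct.Struct("=iiQQ")


def send_report(sock: socket.socket, signum: int, pc: int, lr: int) -> None:
    """Crashing side: pack the raw values and send them in one datagram."""
    sock.send(REPORT.pack(os.getpid(), signum, pc, lr))


def format_report(data: bytes) -> str:
    """Daemon side: decode the record and format it for the console or log."""
    pid, signum, pc, lr = REPORT.unpack(data)
    name = signal.Signals(signum).name
    return f"pid={pid} signal={name} pc={pc:#x} lr={lr:#x}"


if __name__ == "__main__":
    crashed_side, daemon_side = socket.socketpair(socket.AF_UNIX,
                                                  socket.SOCK_DGRAM)
    send_report(crashed_side, signal.SIGSEGV, 0x8484, 0x8470)
    print(format_report(daemon_side.recv(REPORT.size)))
```

The split mirrors the slide: the handler only gathers and sends raw data, while all formatting and storage happens in the daemon, where ordinary library functions are safe to use.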
![Page 42: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/42.jpg)
The PCD design features various loosely coupled software modules:
◦ Main: Performs the initializations and the main loop.
◦ Rule Parser: Reads and parses the textual rules.
◦ Rules DB: Stores all the rules as binary records.
◦ Process: Starts, stops and monitors the processes.
◦ Timer: Provides the ticks for the PCD.
◦ Condition check: Checks if a condition is satisfied.
◦ Failure action: Performs failure/recovery actions.
◦ Exception: Implements the detailed exception handlers.
◦ API: The PCD API interface (as a separate library).
PCD Software modules
![Page 43: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/43.jpg)
PCD Software modules block diagram
[Block diagram: MAIN asks the PARSER to parse the textual configuration file with rules, and the PARSER adds each rule to the RULES DB. MAIN activates rules, receives ticks from the TIMER and checks messages arriving over IPC from applications that use the PCD API. While iterating over the rules, CONDITION CHECK reports OK / NOK for each condition, and PROCESS CONTROL spawns, signals and monitors processes, which may be stopped, signaled or exited. A failure, or a crash reported by the EXCEPTION HANDLER, activates the FAILURE ACTION module.]
![Page 44: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/44.jpg)
A textual file, similar to shell script syntax.
Contains a list of “Rule Blocks”.
A Rule block is defined per process.
Scripts can be extended by including other scripts:
◦ This allows dedicating a script to each logical or functional sub-system in the system.
PCD Rules Script
![Page 45: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/45.jpg)
Rules and Processes block diagram
[Block diagram: the PARSER reads the PCD script (a list of rules) and adds each rule to the RULES DB. Each rule is associated with one process and may depend on other rules. PROCESS CONTROL starts, stops and monitors the processes.]
![Page 46: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/46.jpg)
PCD Rules Script Syntax
![Page 47: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/47.jpg)
The PCD provides a parser, which offers an easy way to verify that your PCD scripts do not contain syntax errors, similar to a compilation process.
The parser makes it possible to fix the configuration files on the host, without the need to run them on the target and rebuild an image in case of an error.
Syntax Checking
![Page 48: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/48.jpg)
No purchase costs or royalty fees.
Source code is fully available.
High quality code due to extensive exposure.
LGPL allows linking proprietary code with PCD.
Continuous development and bug fixes.
Need a new feature?
◦ Either request it in the project tracker system
◦ Or join the PCD community and develop it, so others could also enjoy your productivity.
PCD - Open Source Benefits
![Page 49: Process control daemon](https://reader033.vdocuments.mx/reader033/viewer/2022061210/548eb952b47959813b8b4796/html5/thumbnails/49.jpg)
Support more platforms.
Watchdog/Keep alive mechanism.
Kernel monitoring agent/module.
Rule enhancements:
◦ Affinity
◦ Resource limitation (CPU, Heap, Stack, Fork Bombs..)
◦ Current working directory
◦ Others…
PCD – Wish list (Future Features)