PySmbC: Python C Modules are Easy


Upload: roberto-polli

Post on 16-Apr-2017


PySmbC:
C Modules are Easy

EuroPython 2012, 6th July - Firenze

Babel Srl P.zza S. Benedetto da Norcia, 33 0040, Pomezia (RM) www.babel.it

What? Who? Why?

A story about how easy it is to contribute to a Python project using GitHub and the Nose testing framework.

Roberto Polli - Community Manager @ Babel.it. Loves writing in C, Java and Python. Red Hat Certified Engineer.

Fabio Isgrò - System Engineer @ Babel.it. Linux and Samba expert.

Babel - Proud sponsor of this talk ;) Delivers large mail infrastructures based on Open Source software for Italian ISPs and PA. Contributes to various FLOSS projects.

Hi everybody, I'm Roberto Polli from Babel and I'm going to tell you a python story.

A story about how easy it is to contribute to a Python project even if you're a py-noob like me, if you just use all the tools that the FLOSS world gives you.

Before starting I'd like to thank Babel, the proud sponsor of this talk, for letting me play with this project even after I finished the job.

Babel delivers large mail infrastructures for ISPs, PA and GOV, using and contributing to open source software. This story is just a drop in the sea of Babel's various contributions to FLOSS.

Agenda - 1

Contributing to a Python C Extension is easier than thought

+ GitHub allows a fast way to fork and merge patches to open source projects

+ Writing and running unit and functional tests is easy with Nose

= Don't have to be a Guru to support FLOSS!

People love FLOSS mostly because it's free as in beer. Many companies think that contributing is costly and useless, or they simply don't have the required knowledge to mess with the code.

But today there are many technologies that should make them rethink that, because the hardest part is getting started.

With a social coding platform like GitHub and a clean testing process, it's easy to write working code and contribute it back.

You don't have to be a guru to contribute!

Agenda - 2

SMB is the protocol used by Windows for sharing folders. Samba is its FLOSS implementation.

smbclient library supports almost all SMB features. PySmbC wraps some functions provided by this library.

To support extended permissions (ACL) we wrapped two more functions: getxattr and setxattr.

You don't need to know SMB to extend PySmbC


GitHub social coding

GitHub is a social coding platform.

Clone my repository and fork the project: https://github.com/ioggstream/pysmbc

Patch and commit to your repository

Push your changes to my repo

Discuss for approval

Merge my changes

Did you say...fork?

So, what's GitHub? Is there somebody here that doesn't know GitHub? Raise your hand!

The standard FLOSS development pattern was based on checking out from the main repository and sending patches to a mailing list.

The GitHub one is based on forking. [read the slide]

Did you say fork???

GitHub social coding

Forking is the nightmare of every floss maintainer.

Sparse writes, increasing merge efforts, lost changes.

GitHub, like the I/O Scheduler, helps to merge writes!

Tracks forks;

Push changes to master.

Forking is historically the nightmare of FLOSS maintainers. It fragments the user base and scatters development efforts.

The first image shows various forks. Merging the patches back is hard, and so is applying fixes from the master branch to the forks.

So the children of a fork are sparse writes, merge difficulties and lost changes.

GitHub acts like the Linux I/O scheduler, helping us to merge writes! It does so by tracking forks and allowing us to review, push and merge changes from forks to master.

When I commit a (possibly small) change to my repo, e.g. a fix, I can push it to the master branch.

The maintainer can validate it, possibly adding some more tests, and then apply it. At this point I just have to merge my branch with the master one, which now contains my patch!

GitHub social coding

Patch and commit to your repository

don't lose your changes;

track your own history.

Push your changes to my repo

no more need for .patch files;

I get the change history.

Discuss for approval

Github portal supports code annotations.

Merge my changes

use git to merge from a remote repository!

Here are some GitHub features: [read slide]

Enter PySmbC - Extensions

Python wrapper around libsmbclient: run C code from python

enjoy a stable C library;

just manage input, output and errors;

usually faster than a pure python implementation.

[Diagram: smbc.so wraps libsmbclient.so.0]

rpolli$ ipython
In [1]: import smbc

In [2]: print(smbc.XATTR_ACL)
system.nt_sec_desc.acl
In [3]:

GitHub was essential for our contribution. But first we had to understand how pysmbc works.

The Linux smbclient library implements almost all SMB functionality and is continuously maintained by the Samba team. Writing a pure Python implementation of the SMB client protocol would have been redundant!

Python has a nice feature: C Extensions. They are a way to wrap existing C functions in a Python module.

The main advantages are listed on the slide, while a drawback is that the resulting module is platform-dependent.

ipython is a Python console supporting GNU readline and many other nice features. To use the C Extension smbc.so, which wraps libsmbclient.so, just set the PYTHONPATH and run: import smbc

And enjoy all the features of smbc.

Example - Wrapping factorial() - 1

The wrapping function my_factorial():

Parses and validates the input;

Calls the wrapped function();

Returns a python object.

A given structure maps python methods to C functions.

Now we can invoke a wrapped function!

// Python C Extension (wrapperfy.c)
// uses factorial() from fact.c
#include <Python.h>

// returns a python object!
PyObject *my_factorial(...) {
    ...
    ret = factorial(n);
    ...
    // build a Python integer from the C long
    return PyLong_FromLong(ret);
}

# python script
import _wrapper

print(_wrapper.factorial(4))

// Maps _wrapper.factorial
// to my_factorial
PyMethodDef BabelMethods[] = {
    {"factorial", my_factorial, ... },
    {NULL, NULL, 0, NULL} /* Sentinel */
};

Here is a code example of wrapping a simple factorial() C function.

The C Extension is a C file named e.g. wrapperfy.c.

The wrapper function is my_factorial(): it parses the arguments received from Python, calls the wrapped factorial() function, and returns a PyObject, in this case a Python long.

To associate the my_factorial() function with a Python method we need a map: the PyMethodDef variable contained in wrapperfy.c.

Once wrapperfy.c is built into _wrapper.so with gcc, we can play again with ipython:

import _wrapper

_wrapper.factorial(4)
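The three steps of the wrapping function and the role of the method map can be mimicked in plain Python. This is only an analogy to make the flow visible, not the C code itself: factorial() stands in for the C function from fact.c, and BABEL_METHODS is a hypothetical dictionary playing the part of the PyMethodDef table.

```python
# Pure-Python analogy of the C wrapper; every name here is illustrative.

def factorial(n):
    """Stand-in for the wrapped C function from fact.c."""
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result


def my_factorial(*args):
    """Stand-in for the wrapping function."""
    # 1. Parse and validate the input (PyArg_ParseTuple does this in C).
    if len(args) != 1 or not isinstance(args[0], int):
        raise TypeError("factorial() takes exactly one integer")
    # 2. Call the wrapped function.
    ret = factorial(args[0])
    # 3. Return a Python object (PyLong_FromLong does this in C).
    return ret


# Analogue of the PyMethodDef table: Python-visible name -> wrapping function.
BABEL_METHODS = {"factorial": my_factorial}

print(BABEL_METHODS["factorial"](4))
```

In the real extension the interpreter performs the table lookup for us when we write _wrapper.factorial(4).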

Example - Wrapping factorial() - 2

Parsing and validating Input and Output is fundamental. We don't want python to SEGFAULT!

Create new exceptions in the initialization function.

Throw exceptions in the function:

setting PyErr;

returning NULL.


// returns a python object!
PyObject *my_factorial(..., PyObject *args) {
    // NULL indicates an error
    if (!PyArg_ParseTuple(args, "i", &n))
        return NULL;
    // n! needs more than 8 bytes (21! no longer fits)
    if (n > 20) {
        ...
        PyErr_SetString(FactError, "Bad value");
        return NULL;
    }
    ...
    return PyLong_FromLong(ret);
}

PyObject *FactError;
// in extension initialization...
...
init_wrapper(void) {
    ...
    // define an exception
    FactError = PyErr_NewException("_wrapper.error", NULL, NULL);
    ...
}

When writing a C Extension we should be careful to avoid memory leaks and segmentation faults.

All the memory management is up to you!

A good starting point is to validate memory areas and variable contents, adding new exceptions to our module and throwing them when issues could arise.

This is done by: properly setting the Python error and stack trace variables with the PyErr_ functions, which play a role similar to the errno variable in C; returning NULL from the function.

In our example, we created an exception in the initialization function of the extension. Then we raise it using PyErr_SetString() in case of an invalid parameter in factorial().
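To see where the input check should draw the line, here is a small Python sketch that finds the largest n whose factorial still fits in 8 bytes. It assumes the C function returns a signed 64-bit integer, which the slides do not state explicitly; FactError and checked_factorial() are Python-side stand-ins invented for this illustration.

```python
import math

INT64_MAX = 2**63 - 1  # largest value of a signed 64-bit C integer (assumed return type)


class FactError(Exception):
    """Stand-in for the exception created in the extension's init function."""


def checked_factorial(n):
    """Hypothetical helper: refuse inputs whose factorial would overflow 8 bytes."""
    if n < 0 or math.factorial(n) > INT64_MAX:
        raise FactError("Bad value")
    return math.factorial(n)


# Largest n for which n! still fits in a signed 64-bit integer.
largest_safe = max(n for n in range(30) if math.factorial(n) <= INT64_MAX)
print(largest_safe)
```

From Python, the caller simply sees an exception instead of a silently wrong number or a crash.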

Example - Wrapping factorial() - 3

C Extension components:

wrapping functions;

method/function map;

exceptions;

initialization function.

Functions and exceptions should be static.

You have to track memory usage!

// Python C Extension
#include <Python.h>

// exceptions
static PyObject *FactError;
static PyObject *FiboError;

// functions
static PyObject *my_factorial(...);
static PyObject *my_fibonacci(...);

// Function/Method Maps
static PyMethodDef BabelMethods[] = {
    {"factorial", my_factorial, ... },
    {"fibonacci", my_fibonacci, ... },
    {NULL, NULL, 0, NULL} /* Sentinel */
};

// Initialization function (Python 2 API)
PyMODINIT_FUNC
init_wrapper(void)
{
    PyObject *m;
    m = Py_InitModule("_wrapper", BabelMethods);
    // Allocate Exceptions
    FactError = PyErr_NewException(...);
    FiboError = PyErr_NewException(...);
}

Main sections of our wrapper module:
- wrapper functions
- exceptions
- function/method map
- initialization function

All functions should be static: they are mapped statically into the .so file and are not usable outside that file.

And as I said above, you have to track memory usage!

Enter PySmbC - Modules

Python C extensions may enjoy both C and Python code.

Wrap the C extension in a Python module.

Extend the module with python classes.

$PYTHONPATH/
    wrapper/
        __init__.py
        _wrapper.so
        helpers.py

In [1]: import wrapper
In [2]: assert wrapper.helpers
In [3]: wrapper.helpers.is_integer(10)

We can even enjoy Python code by wrapping the extension in a module.

The convention is to prepend an underscore _ to the C extension library: wrapper.so becomes _wrapper.so. In this way we can create a module named wrapper.

The slide shows the module directory structure. It's initialized by __init__.py, which includes: all the _wrapper methods; methods contained in further .py files.

This could be a future layout for pysmbc, letting us add further Python methods to manage e.g. xattr strings.
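A rough sketch of that layout follows. It builds the wrapper/ package in a temporary directory and imports it. Since a compiled _wrapper.so is not available here, a pure-Python _wrapper.py stands in for it, and the bodies of factorial() and is_integer() are made up for the illustration; only the file names come from the slide.

```python
import importlib
import os
import sys
import tempfile

# Build the package layout in a temporary directory.
# _wrapper.py is a pure-Python stand-in for the compiled _wrapper.so.
root = tempfile.mkdtemp()
pkg = os.path.join(root, "wrapper")
os.mkdir(pkg)

files = {
    "_wrapper.py": (
        "def factorial(n):\n"
        "    return 1 if n < 2 else n * factorial(n - 1)\n"
    ),
    "helpers.py": (
        "def is_integer(value):\n"
        "    return isinstance(value, int)\n"
    ),
    # __init__.py re-exports the extension's functions and loads the helpers.
    "__init__.py": (
        "from ._wrapper import factorial\n"
        "from . import helpers\n"
    ),
}
for name, source in files.items():
    with open(os.path.join(pkg, name), "w") as handle:
        handle.write(source)

sys.path.insert(0, root)  # same effect as adding the directory to PYTHONPATH
importlib.invalidate_caches()
wrapper = importlib.import_module("wrapper")

print(wrapper.factorial(4))
print(wrapper.helpers.is_integer(10))
```

Users import only wrapper; whether a function lives in C or in Python becomes an implementation detail.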

Nose - let's contribute - 1

Before adding features to PySmbC we checked the project status

# git clone https://github.com/ioggstream/pysmbc .
# vim tests/settings.py # set samba credential
# nosetests tests/

NoseTest - a Python script that auto-discovers and runs test cases. Wraps python-unittest.

Add new features only after successful tests. Verify your environment (e.g. Samba credentials, directory ACLs).

Now that we understand how C extensions and modules work, let's see how to get our work accepted by maintainers.

Once you clone (check out) your project, you have to set up your environment (in our case a Samba server) so that all tests succeed. So check the credentials, the workgroup and the whole environment!

If you care about tests, the maintainer will care about your work!

Running tests on pysmbc is easy. Just run nosetests against the test directory. Nose will browse the directory and discover all your test cases.

When all tests are ok, you can start developing!

Nose - let's contribute - 2

On successful tests, we can start developing

Follow the Git way: create a separate branch. We'll merge it on success

# git checkout -b ioggstream_setxattr

Write the tests before writing the code. You'll be more focused on your targets.

With nosetests it's simpler than ever!

Before coding, branch your code tree and name the branch after the goal of your changes.

I used ioggstream_setxattr, so that the maintainer could easily see who made the changes and what they were about.

Branching lets us preserve the original tree and merge master with the new changes made on the maintainer's repository.

Writing tests before code is even simpler with Nose Test!

Nose is like UnitTest

UnitTest

from unittest import TestCase, main

class MyTest(TestCase):

    def setUp(self):
        print("setup every")

    def tearDown(self):
        print("teardown every")

    def test_do(self):
        print("do 1")

if __name__ == "__main__":
    main()

Nose

import nose

# the class name must match nose's test pattern to be discovered
class TestMy:

    def setup(self):
        print("setup")

    def teardown(self):
        print("teardown")

    def test_do(self):
        print("do 1")

# nose script will auto-discover
# this script named test_script.py

Does everybody know something about unit testing? The standard Python library for unit testing is python-unittest.

It lets you write Test classes, with setup/teardown methods.

Nose works in a similar way. With the nosetests script you can save some bytes and avoid writing main(). And you can even run unittest programs.
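To make the "setup every" / "teardown every" behaviour concrete, here is a small sketch that uses only the standard unittest module. It records the order of the calls in a list instead of printing them; the calls list and the test names are made up for the example.

```python
import unittest

calls = []  # records the order in which the fixtures and tests run


class MyTest(unittest.TestCase):

    def setUp(self):
        calls.append("setup")

    def tearDown(self):
        calls.append("teardown")

    def test_one(self):
        calls.append("one")

    def test_two(self):
        calls.append("two")


# Load and run the test case programmatically instead of calling main().
suite = unittest.defaultTestLoader.loadTestsFromTestCase(MyTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)

print(calls)
```

The setUp/tearDown pair wraps every single test method, which is why each of them shows up twice in the recorded list.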

Nose is simpler than UnitTest

Nose: simple test

# don't need to import nose
# or define a class

def setup():
    print("setup once for all tests")

def teardown():
    print("teardown once for all tests")

def test_do():
    print("do 1")

def test_fail():
    assert False

Nose: annotations

from nose import SkipTest, with_setup

def pre():
    print("setup")

def post():
    print("teardown")

@with_setup(pre, post)
def test_do():
    print("do 1")

@SkipTest
def test_dont():
    print("not done yet")

Nose simplifies test writing. In many cases you don't even need to import nose. Just write your test functions, with names beginning with test_.

Moreover it supports basic decorators like @SkipTest and @with_setup, to skip tests or customize their setup and teardown.
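If you wonder how discovery by name and a with_setup-style decorator can work, here is a deliberately tiny imitation written with the standard library only. It is not nose's implementation: with_setup() and run_tests() below are simplified, hypothetical versions that only show the idea of attaching fixtures to a function and running everything whose name starts with test_.

```python
def with_setup(setup=None, teardown=None):
    """Attach optional fixtures to a test function (simplified imitation)."""
    def decorate(func):
        func.setup = setup
        func.teardown = teardown
        return func
    return decorate


def run_tests(namespace):
    """Run every callable named test_*; return a {name: passed} dictionary."""
    results = {}
    for name in sorted(namespace):
        func = namespace[name]
        if not name.startswith("test_") or not callable(func):
            continue
        setup = getattr(func, "setup", None)
        teardown = getattr(func, "teardown", None)
        try:
            if setup:
                setup()
            try:
                func()
            finally:
                if teardown:
                    teardown()  # runs even when the test itself fails
            results[name] = True
        except Exception:
            results[name] = False
    return results


def sample_tests(log):
    """Build two sample tests inside a function so other runners ignore them."""
    @with_setup(lambda: log.append("setup"), lambda: log.append("teardown"))
    def test_do():
        log.append("do 1")

    def test_fail():
        assert False

    return {"test_do": test_do, "test_fail": test_fail}


log = []
results = run_tests(sample_tests(log))
print(results)
print(log)
```

The real nosetests script adds a lot on top of this (output capture, plugins, class and package fixtures), but the naming convention is the core of it.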

Nose Invocation

You can run all your tests in a given directory
# nosetests ./path/

Or just one file
# nosetests ./path/test_sample.py

Or even a single test method
# nosetests ./path/test_sample.py:test_do1

Or a suite, optionally setting the working directory
ex1# nosetests ./path/test_class.py:TestOne
ex2# nosetests -w ./path test_class:TestOne

For verbose output just use:
# nosetests -sv [args]

The nosetests script is quite flexible.

You can run all tests, select a single file, class or even a single test.

Everything works via command-line and is scriptable.

By default nosetests captures what tests write to stdout. You can disable that with -s, and enable verbose output with -v.

PySmbC - add getxattr

Nose ensures that we're not going to break anything.

Start writing tests, not coding functionalities.

You can @SkipTest until new functions are ready.

Play with the wrapped functions.

Start with the simpler one: getxattr()

embed C constants into python;

test good values;

check behavior with bad values.

Code until tests are successful.

# from test_context.py
def test_xattr_constants():
    '''reuse variable defined in smbclient.h'''
    assert smbc.XATTR_ACL
    assert smbc.XATTR_OWNER
    assert smbc.XATTR_GROUP

def test_xattr_get():
    '''test xattr with all possible values'''
    . . .
    for xa in valid_xatts:
        assert ctx.getxattr(url, xa)

def test_xattr_get_error():
    '''xattr_get should recognize bad values'''
    . . .
    for xa in invalid_xatts:
        try:
            ctx.getxattr(url, xa)
            assert False
        except RuntimeError as e:
            . . . #get errno
            assert errno == EINVAL

Nose ensures that we're not going to break anything.

The next step is to write some test cases for the new functionality.

Writing tests lets us focus on the expected input and output, and requires us to read the wrapped library's documentation.

We started with the simpler one: getxattr.
T1 - check the existence of all the new constants
T2 - test getxattr with invalid parameters
T3 - test getxattr with valid parameters
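The "bad values" test can be tried out without a Samba server by using a fake context. The sketch below is only that: FakeContext is a hypothetical stand-in, not the real pysmbc API, and it assumes the error carries the errno as its first argument. The attribute name is the value of smbc.XATTR_ACL shown earlier; the URL is a placeholder.

```python
import errno


class FakeContext:
    """Hypothetical stand-in for an smbc context; not the real pysmbc API."""

    VALID = {"system.nt_sec_desc.acl"}

    def getxattr(self, url, name):
        if name not in self.VALID:
            # assumption: the error carries (errno, message) as its arguments
            raise RuntimeError(errno.EINVAL, "Invalid argument")
        return "REVISION:1"


def check_getxattr_rejects(ctx, url, bad_names):
    """Return the names that were accepted or that failed with another errno."""
    wrong = []
    for name in bad_names:
        try:
            ctx.getxattr(url, name)
        except RuntimeError as exc:
            if exc.args[0] != errno.EINVAL:
                wrong.append(name)
        else:
            wrong.append(name)  # no exception: the bad value was accepted
    return wrong


ctx = FakeContext()
print(check_getxattr_rejects(ctx, "smb://host/share/file", ["bogus", ""]))
```

Using try/except/else keeps the "no exception was raised" case separate from the "wrong errno" case, so a failing run tells you which of the two happened.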

PySmbC - add setxattr and futures

Helper methods for parsing and creating ACL

attrs_new = u'REVISION:1' \
    + ',OWNER:RPOLLI\\babel' \
    + ',GROUP:Unix Group\\babel' \
    + ',ACL:RPOLLI\\babel:0/0/0x001e01ff' \
    + ',ACL:Unix Group\\babel:0/0/0x00120089' \
    + ',ACL:Unix Group\\games:0/0/0x001e01ff' \
    + ',ACL:\\Everyone:0/0/0x00120089'

Shift from smbc.so to smbc module:

smbc/_smbc.so
smbc/__init__.py
smbc/helper.py

# from test_context.py

def test_xattr_set():
    . . .
    ctx.setxattr(url, a_name, attrs_new, REPLACE)
    attrs_1 = ctx.getxattr(url, a_name)
    assert attrs_1 == attrs_new

def test_xattr_set_error():
    '''setxattr should recognize bad values'''
    . . .
    for xa in invalid_xatts:
        try:
            ctx.setxattr(url, a_name, xa, REPLACE)
            assert False
        except RuntimeError as e:
            . . . #get errno
            assert errno == EINVAL
        except TypeError:
            pass

setxattr is not user friendly. You have to create ACL strings by hand.

The slide shows a simple ACL with a file accessible from two different groups.

Python is really good at handling strings, so I started to work on a Python class to manage ACLs.

To include it in pysmbc I had to reorganize the pysmbc structure on my branch.

Test classes are still the same, living in the same path, so I'm sure I'm not going to break anything.

I'm currently discussing the new layout of the module with the maintainer.
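To give an idea of what such helpers could look like, here is a minimal sketch that parses and rebuilds an ACL string in the format shown on the slide. It is not the code from the pysmbc branch: parse_acl() and build_acl() are hypothetical names, and the sketch assumes that every ACL entry has the form trustee:type/flags/mask and that names contain no commas.

```python
def parse_acl(text):
    """Split an ACL string into revision, owner, group and ACL entries."""
    parsed = {"REVISION": None, "OWNER": None, "GROUP": None, "ACL": []}
    for item in text.split(","):
        key, _, value = item.partition(":")
        if key == "ACL":
            # assumption: each entry is trustee:type/flags/mask
            trustee, _, rights = value.rpartition(":")
            ace_type, flags, mask = rights.split("/")
            parsed["ACL"].append((trustee, int(ace_type), int(flags), int(mask, 16)))
        else:
            parsed[key] = value
    return parsed


def build_acl(parsed):
    """Rebuild the string form from the structure returned by parse_acl()."""
    parts = [
        "REVISION:%s" % parsed["REVISION"],
        "OWNER:%s" % parsed["OWNER"],
        "GROUP:%s" % parsed["GROUP"],
    ]
    for trustee, ace_type, flags, mask in parsed["ACL"]:
        parts.append("ACL:%s:%d/%d/0x%08x" % (trustee, ace_type, flags, mask))
    return ",".join(parts)


attrs = ("REVISION:1,OWNER:RPOLLI\\babel,GROUP:Unix Group\\babel"
         ",ACL:RPOLLI\\babel:0/0/0x001e01ff"
         ",ACL:\\Everyone:0/0/0x00120089")
acl = parse_acl(attrs)
print(acl["OWNER"])
print(build_acl(acl) == attrs)
```

With helpers like these, a caller could change a single mask and hand the rebuilt string to setxattr instead of editing the string by hand.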

Thank You!

PySmbC
https://github.com/ioggstream/pysmbc
http://pypi.python.org/pypi/pysmbc/
http://www.samba.org

Babel
http://www.babel.it
http://vaunaspada.babel.it/blog

That's all. Your questions are welcome!

I'll leave you just a few links, in case you're interested in contributing to PySmbC or just want to find some useful posts on Linux and coding on vaunaspada, the Babel company blog.
