poster hadoop summit 2011: pig embedding in scripting languages

1

PIG Scripting Making Pig Turing-complete through embedding in a scripting language Julien Le Dem - Yahoo Solution 2: using Pig scripting Embedded Pig calls in Python Python functions are automatically available as UDFs Python variables can be used in the pig scripts $n, $i, ... UDFs use standard Python constructs (tuple/list/dictionary ) automatically converted to Pig. UDFs take the elements of the tuple as parameters, not a tuple. The output schema is specified using a decorator (it can be a function if you need to manipulate the input schema) Unfortunately this solution does not fit here. It requires 7 artifacts: - 3 Java UDFs: 3 classes must be compiled and packaged in a jar. The average UDF size is 50 to 100 lines of code. - 3 Pig scripts in 3 separate files: init, main loop content, finalization. The average Pig script size is 5 to 10 lines. - Main program: executes and coordinates the Pig scripts. It contains about 50 lines of code. 1 file required: Modification flow: Modification flow: Overview Example: Transitive closure - Iterative process: requires a loop and a termination condition - Requires multiple join/group by: typical Pig usage - Requires User Defined Functions - Map/Reduce: a solution to process big data. - Pig: makes it easier to manipulate data on the grid. - Pig scripting: makes Pig easier with iterative algorithms and User Defined Functions. PIG-928 (in 0.8) Solution 1: plain Pig Pig execution model 7 files required: This solution does fit on the poster. It requires 1 artifact, containing: - 3 UDFs provided as functions: The average UDF size is 8 to 12 lines of code. - 3 Pig queries: init, main loop content, finalization. Each Pig query adds an average 5 to 10 lines. - Main function: executes and coordinates the Pig scripts. It adds about 10 lines to Access to the output of pig scripts

Upload: julien-le-dem

Post on 29-Jun-2015

1.026 views

Category:

Technology

0 download

Report

Download

Tags:

Embed Size (px):

DESCRIPTION

Poster presented at Hadoop summit 2011

TRANSCRIPT

Page 1: Poster Hadoop summit 2011: pig embedding in scripting languages

PIG ScriptingMaking Pig Turing-complete through embedding in a scripting language

Julien Le Dem - Yahoo

Solution 2: using Pig scripting

Embedded Pig calls in Python

Python functions are automatically available as UDFs

Python variables can be used in the pig scripts $n, $i, ...

UDFs use standard Python constructs (tuple/list/dictionary) automatically converted to Pig.

UDFs take the elements of the tuple as parameters, not a tuple.

The output schema is specified using a decorator (it can be a function if you need to manipulate the input schema)

Unfortunately this solution does not fit here.

It requires 7 artifacts:

- 3 Java UDFs: 3 classes must be compiled and packaged in a jar. The average UDF size is 50 to 100 lines of code.

- 3 Pig scripts in 3 separate files: init, main loop content, finalization. The average Pig script size is 5 to 10 lines.

- Main program: executes and coordinates the Pig scripts. It contains about 50 lines of code.

1 file required:

Modification flow: Modification flow:

OverviewExample: Transitive closure

- Iterative process: requires a loop and a termination condition

- Requires multiple join/group by: typical Pig usage

- Requires User Defined Functions

- Map/Reduce: a solution to process big data.

- Pig: makes it easier to manipulate data on the grid.

- Pig scripting: makes Pig easier with iterative algorithms and User Defined Functions.

Jiras: PIG-1479, PIG-1794 (in 0.9), PIG-928 (in 0.8)

Solution 1: plain Pig

Pig execution model

7 files required:

This solution does fit on the poster.

It requires 1 artifact, containing:

- 3 UDFs provided as functions: The average UDF size is 8 to 12 lines of code.

- 3 Pig queries: init, main loop content, finalization. Each Pig query adds an average 5 to 10 lines.

- Main function: executes and coordinates the Pig scripts. It adds about 10 lines to the script.

Access to the output of pig scripts

Adobe Photoshop CC Scripting · PDF file8 2 Photoshop Scripting Basics This chapter provides an overview of scripting for Photoshop, describes scripting support for the scripting languages

Positioning, Campaigns & 2.0 Launch...•Apache Hive –Most popular SQL-like interface for data in Hadoop •Apache Pig –Scripting language used in some of the largest Hadoop installations

Adobe Intro to Scripting Intro to Scripting

Db2 Scripting With Windows Scripting Host

Dachis Group Pig Hackday: Pig 202

Apache Big Data Europe 2016 Power Pig with Spark...Apache Pig Procedural scripting language Pig Latin: similar to sql Heavily used for ETL Schema / No schema data, Pig eats everything

Scripting Language and Scripting Engine Programming Guideoceanoptics.com/wp-content/uploads/JSL-Programming-Manual.pdf · Scripting Language and Scripting Engine Programming Guide

HTML Scripting - €¦ · Dynamisches HTML Client Side Scripting (DOM) Server Side Scripting (PHP, Perl, Python, . . .) Scripting Inf1NF 3

Introduction to Linux Scripting (Part 2): Advanced Scripting

Adobe Photoshop CC 2014 Scripting · PDF file8 2 Photoshop Scripting Basics This chapter provides an overview of scripting for Photoshop, describes scripting support for the scripting

Scripting Languages. Client Side Scripting Languages Server side Scripting Languages

Client side scripting and server side scripting

APIs for Scripting - 2014 - Perforce · APIs for Scripting APIs for Scripting v Instance Methods ..... 32

Developing Pig on Tez - events.static.linuxfound.org...Developing Pig on Tez Mark Wagner Committer, Apache Pig LinkedIn Cheolsoo Park VP, Apache Pig Netflix. What is Pig Apache project

Open Basic Interpreter User Guide Version 1obasic.sourceforge.net/read_eng.pdf · Open Basic is developed for embedding into user application as a scripting language. Open Basic is

Scripting - Client-side vs. Server-side Scripting

MapReduce vs Pig | MapReduce Pig Integration

Embedding Pig in scripting languages

Scripting - Home | IHO · Part 50 - Scripting 7 Figure 2 - Scripting Catalogue / Host interaction within a Scripting Domain 50-6.1 Distribution The distribution mechanism of a scripting

More Scripting Techniques Scripting Process Example Script 1

Scripting versus Programming Scripting 1 bash · Scripting Computer Organization I 1 CS@VT ©2005-2016 McQuain Scripting versus Programming bash supports a scripting language. Programming

Keysight Technologies De-Embedding and Embedding …literature.cdn.keysight.com/litweb/pdf/5980-2784EN.pdf · Keysight Technologies De-Embedding and Embedding S ... Advanced Design

Pig health= pig wealth

DIGIT APPS SELECTING PROGRAMMING LANGUAGES FOR THE … · 11/28/2017 Selecting Programming Languages for the IoT ... Ericsson Mobility Report ... Pig Latin scripting language

1 Topics: Scripting What is game scripting Scripting Features Torque Script

Scripting versus Programming Scripting 1 bash

AE6382 Introduction to Scripting AE 6382. What is a scripting l Scripting is the process of programming using a scripting language l A scripting language,

Best Practice Web Client Scripting - Kofax Web Client scripting can be divided in client-side scripting and server-side scripting. 1.2.1 Client-side Scripting With client-side scripting

Scripting Windows A. Habert C. Bravo Scripting Antoine ...thomashenrywarner.free.fr/eBooks/Scripting/Scripting Windows... · De Windows NT4 à Windows XP et 2003, ... constantes,

Distributed Systems - Rutgers Universitypxk/417/notes/content/20-graph-computing-slides.pdfApache Pig •Why? –Make it easy to use MapReduce via scripting instead of Java –Make

¡Peppa Pig en tu fiesta! · 03/09/2018 ¡Peppa Pig en tu fiesta! SHOW DESCRIPCION COSTO Cuento 8 PERSONAJES: Peppa Pig, George, Mamá Pig, Papá Pig…

SCRIPTING REFERENCE BETA - Blackmagic Designdocuments.blackmagicdesign.com/.../Fusion_8_Scripting_Reference.pdf · CONTENTS SCRIPTING REFERENCE FUSION SCRIPTING REFERENCE Fusion 8

MODULE 2 - pace.edu.in · Apache Pig Apache Pig is high level language that enables programmer to write complex MapReduce transformation using simple scripting language. Pig often

Pig Veterinary Society THE CASUALTY PIG - … Pig - April 2013-1... · 1 Pig Veterinary Society THE CASUALTY PIG Interim Update April 2013