Transforming Data within a Hadoop Cluster


How to transform data within the Hadoop cluster using Pentaho MapReduce, Hive, and Pig.

  • Using Pentaho MapReduce to Parse Weblog Data — How to use Pentaho MapReduce to convert raw weblog data into parsed, delimited records.
  • Using Pentaho MapReduce to Generate an Aggregate Dataset — How to use Pentaho MapReduce to transform and summarize detailed data into an aggregate dataset.
  • Transforming Data within Hive — How to read data from a Hive table, transform it, and write it to a Hive table within the workflow of a PDI job.
  • Transforming Data with Pig — How to invoke a Pig script from a PDI job.
  • Using Pentaho MapReduce to Parse Mainframe Data — How to use Pentaho to ingest a mainframe file into HDFS, then use MapReduce to process it into delimited records.
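To give a sense of the parse-then-aggregate pattern the first two guides cover, here is a minimal, illustrative Python sketch. It is not Pentaho's implementation; the log pattern, field names, and sample lines are assumptions chosen to mirror a typical weblog (Apache Common Log Format) workflow: the "map" step parses raw lines into delimited records, and the "reduce" step summarizes them into an aggregate dataset.

```python
import re
from collections import defaultdict

# Illustrative pattern for Apache Common Log Format lines (an assumption,
# not the format Pentaho's guides necessarily use).
LOG_PATTERN = re.compile(
    r'(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+) [^"]*" (\d{3}) (\S+)'
)

def parse_weblog_line(line):
    """Map step: turn one raw log line into a tab-delimited record
    (ip, timestamp, method, path, status, bytes), or None if unparseable."""
    m = LOG_PATTERN.match(line)
    if m is None:
        return None
    return "\t".join(m.groups())

def aggregate_hits_by_ip(records):
    """Reduce step: summarize parsed records into an aggregate dataset
    mapping each client IP to its request count."""
    counts = defaultdict(int)
    for rec in records:
        ip = rec.split("\t")[0]
        counts[ip] += 1
    return dict(counts)

# Hypothetical sample input standing in for raw weblog data in HDFS.
raw = [
    '127.0.0.1 - - [10/Oct/2011:13:55:36 -0700] "GET /index.html HTTP/1.1" 200 2326',
    '10.0.0.5 - - [10/Oct/2011:13:55:40 -0700] "POST /form HTTP/1.1" 404 512',
    '127.0.0.1 - - [10/Oct/2011:13:56:01 -0700] "GET /about.html HTTP/1.1" 200 1024',
]
parsed = [r for r in (parse_weblog_line(l) for l in raw) if r is not None]
summary = aggregate_hits_by_ip(parsed)
```

In Pentaho MapReduce, the equivalent mapper and reducer logic would be built as PDI transformations and submitted to the cluster, but the shape of the computation is the same.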

This documentation is maintained by the Pentaho community, and members are encouraged to create new pages in the appropriate spaces, or edit existing pages that need to be corrected or updated.

Please do not leave comments on Wiki pages asking for help. They will be deleted. Use the forums instead.
