Hadoop Project for Advanced Database

Description

This is my final project for the course Advanced Database (UML 91.673). The goal is to deploy Hadoop and use MapReduce to process a large amount of raw data in text format.
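For readers new to the model, the map, shuffle, and reduce phases can be sketched in plain Java without any Hadoop dependency. This is only an in-memory illustration, not the project's code: the class name `MapReduceSketch`, the comma-separated record layout, and the sample records are all assumptions made for the example, and the real jobs in src may parse and aggregate the data differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, in-memory illustration of the MapReduce pattern (no Hadoop needed).
public class MapReduceSketch {

    // Map phase: turn one raw text line into a (key, 1) pair.
    // Assumption: fields are comma-separated and the grouping key is at keyIndex.
    static Map.Entry<String, Integer> map(String line, int keyIndex) {
        String[] fields = line.split(",");
        return Map.entry(fields[keyIndex].trim(), 1);
    }

    // Shuffle phase: collect all values that share a key, with keys kept sorted.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum the values collected for a single key.
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Runs the three phases in order and returns the count of records per key.
    static Map<String, Integer> run(List<String> lines, int keyIndex) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.add(map(line, keyIndex));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : shuffle(pairs).entrySet()) {
            result.put(entry.getKey(), reduce(entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample records; the real layout of the files in data may differ.
        List<String> lines = List.of("1,Kabul,AFG", "2,Amsterdam,NLD", "3,Rotterdam,NLD");
        run(lines, 2).forEach((key, count) -> System.out.println(key + "\t" + count));
    }
}
```

In a real Hadoop job the framework performs the shuffle and distributes the map and reduce calls across the cluster; the sketch only shows how the data flows between the phases.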

Guide

This folder stores the data sources, code, and output for the Advanced Database project. The structure is as follows:

data: the source data files that store the manipulated data (city.txt, country.txt, countrylanguage.txt)

output: the output of each experiment; a flag file indicates the job status, and part-r-00000 holds the generated result.

src: the source code of the project, written in Java; each separate task has its own package, named ex1, ex2, ex3, and ex4.

INSTALL_GUIDE_2nd.pdf: the guide that I referred to when configuring the Hadoop environment
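To inspect a part-r-00000 result outside Hadoop, a small reader like the one below may help. It is a sketch that assumes the jobs write Hadoop's default text output, where each line holds a key and a value separated by a tab; if a job uses a different output format or separator, the parsing needs to change. The class name `PartFileReader` and the sample lines are hypothetical, and the path to the result file is passed on the command line rather than assumed.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for reading a reducer output file as key/value pairs.
public class PartFileReader {

    // Splits each non-empty line at the first tab into (key, value).
    // Assumption: tab-separated text output; lines without a tab get an empty value.
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String line : lines) {
            if (line.isEmpty()) {
                continue;
            }
            int tab = line.indexOf('\t');
            if (tab < 0) {
                result.put(line, "");
            } else {
                result.put(line.substring(0, tab), line.substring(tab + 1));
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Pass the path to a part-r-00000 file as the first argument;
        // without an argument, made-up sample lines are parsed instead.
        List<String> lines = args.length > 0
                ? Files.readAllLines(Paths.get(args[0]))
                : List.of("AFG\t1", "NLD\t2");
        parse(lines).forEach((key, value) -> System.out.println(key + " -> " + value));
    }
}
```

A `LinkedHashMap` is used so the pairs keep the order in which they appear in the file.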

Details

For detailed information, please see the write-up report.

Contact

Email: Chang Liu (chang_liu@student.uml.edu)