As part of a push to expand personalized offerings for private and business customers, and to make more efficient use of its investments in the Oracle Enterprise Data Warehouse platform, Telefónica Germany, one of the largest telecommunications companies in the world, decided to move forward with a new Big Data platform: one that enabled clustering with Hadoop. The decision to push into Big Data had been made, but the challenge of finding the best platform among a wide range of options still lay ahead.
EPAM Helps Telefónica Get Data Running 10× Faster
EPAM specialists ran with three priority use cases for Big Data needs and found a way forward derived from truly agile collaboration.
Key Challenges
- Find the best platform among a wide range of options
- Enable growth, efficiency and insights for the future
Solution Highlights
EPAM specialists from the Advanced Technology Lab ran with three priority use cases for Big Data needs, relying heavily on the elaboration and testing of various platform and configuration options in a truly Agile BI Lab environment. The process, which was time-boxed at three months, concluded with a fully benchmarked analysis and a true platform ‘bake-off’. The solution included:
- Implementations of the selected use cases in three technology alternatives: plain Java MapReduce, Apache Hive, and Apache Spark
- Final implemented solution for Hive without using Oozie
- Full data access on Hadoop from Oracle
- Publishing to Oracle using the following methods:
  - Sqoop
  - Oracle Loader for Hadoop (OLH)
    - Using JDBC interface
    - Using OCI Direct Path interface
- Approaches to Cluster Management: user and resource management
The Results
Working closely with Telefónica, the EPAM team was able to demonstrate several implemented solutions. In the end, the client selected the most successful one, which utilized EPAM’s own Cloud infrastructure and obfuscated data. After the initial proof-of-concept stage, the complete solutions were deployed and rigorously tested on Telefónica BIC’s infrastructure.
Benchmarked against the defined decision criteria, the developed solutions can process large amounts of unstructured data up to 10 times faster than previous alternatives. Beyond the benchmarks, the team’s efforts produced a BI blueprint that gives Telefónica a scalable technological base for future Business Intelligence solutions, including new Big Data tools and techniques.
Technologies Used
- Hadoop 2.0.6
- Apache Hive, Apache Spark
- Oracle DB 11g, Oracle Loader for Hadoop
- Java, HiveQL, Scala
- Development environment: EPAM Cloud