Learning Apache Drill
Query and Analyze Distributed Data Sources with SQL
Charles Givre and Paul Rogers
Beijing, Boston, Farnham, Sebastopol, Tokyo
Learning Apache Drill
by Charles Givre and Paul Rogers
Copyright © 2019 Charles Givre and Paul Rogers. All rights reserved.
Printed in the United States of America.
Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.
O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are
also available for most titles (http://oreilly.com/safari). For more information, contact our
corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.
Acquisitions Editor: Rachel Roumeliotis
Development Editor: Jeff Bleiel
Production Editor: Melanie Yarbrough
Copyeditor: Octal Publishing, LLC
Proofreader: Rachel Head
Indexer: Ellen Troutman-Zaig
Interior Designer: David Futato
Cover Designer: Karen Montgomery
Illustrator: Rebecca Demarest
October 2018: First Edition
Revision History for the First Edition
2018-10-29: First Release
See http://oreilly.com/catalog/errata.csp?isbn=9781492032793 for release details.
The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Learning Apache Drill, the cover image,
and related trade dress are trademarks of O’Reilly Media, Inc.
The views expressed in this work are those of the authors, and do not represent the publisher’s views.
While the publisher and the authors have used good faith efforts to ensure that the information and
instructions contained in this work are accurate, the publisher and the authors disclaim all responsibility
for errors or omissions, including without limitation responsibility for damages resulting from the use of
or reliance on this work. Use of the information and instructions contained in this work is at your own
risk. If any code samples or other technology this work contains or describes is subject to open source
licenses or the intellectual property rights of others, it is your responsibility to ensure that your use
thereof complies with such licenses and/or rights.
978-1-492-03279-3
[LSI]
Table of Contents
Preface
1. Introduction to Apache Drill
    What Is Apache Drill?
    Drill Is Versatile
    Drill Is Easy to Use
    A Word About Drill’s Performance
    A Very Brief History of Big Data
    Drill in the Big Data Ecosystem
    Comparing Drill with Similar Tools
2. Installing and Running Drill
    Preparing Your Machine for Drill
    Special Configuration Instructions for Windows Installations
    Installing Drill on Windows
    Starting Drill on a Windows Machine
    Installing Drill in Embedded Mode on macOS or Linux
    Starting Drill on macOS or Linux in Embedded Mode
    Installing Drill in Distributed Mode on macOS or Linux
    Preparing Your Cluster for Drill
    Starting Drill in Distributed Mode
    Connecting to the Cluster
    Conclusion
3. Overview of Apache Drill
    The Apache Hadoop Ecosystem
    Drill Is a Low-Latency Query Engine
    Distributed Processing with HDFS