Age | Commit message | Author |
|
closes #1627
|
|
Added a compiler option in CMakeLists.txt to support the ISO C++ 2011 standard.
Also, changed the CMake min version to 3.1.3 to match the min version specified in protobuf.
closes #1697
|
|
Queues.
closes #1677
|
|
does not return values is called
- Fix the check for a function's return value to handle the case when a created object is returned without being assigned to a local variable
closes #1687
|
|
- Update dependency versions
- Exclude dependency from jdbc-all
- Remove redundant "bcpkix-jdk15on" exclusion
- Properly exclude "jackson-dataformat-hocon" dependency
- Remove redundant "excludeSubprojects" config property for Maven Rat Plugin
closes #1682
|
|
destruction due to communication error
closes #1660
|
|
closes #1674
|
|
- replaced all String path representations with org.apache.hadoop.fs.Path
- added PathSerDe.Se JSON serializer
- refactored DFSPartitionLocation code to leverage existing listPartitionValues() functionality
closes #1657
|
|
1. Updated protobuf to version 3.6.1
2. Added protobuf to the root pom dependency management
3. Added classes BoundedByteString and LiteralByteString for compatibility with HBase
4. Added ProtobufPatcher to provide compatibility with MapR-DB and HBase
closes #1639
|
|
parquet reader is used
closes #1655
|
|
closes #1661
|
|
Add support for avg row-width and major type statistics.
Parallelize the ANALYZE implementation and stats UDF implementation to improve stats collection performance.
Update/fix rowcount, selectivity and ndv computations to improve plan costing.
Add options for configuring collection/usage of statistics.
Add new APIs and implementation for stats writer (as a precursor to Drill Metastore APIs).
Fix several stats/costing related issues identified while running TPC-H and TPC-DS queries.
Add support for CPU sampling and nested scalar columns.
Add more test cases for collection and usage of statistics and fix remaining unit/functional test failures.
Thanks to Venki Korukanti (@vkorukanti) for the description below (modified to account for new changes). He graciously agreed to rebase the patch onto the latest master, fixed a few issues, and added a few tests.
FUNCS: Statistics functions as UDFs:
Separate
Currently using FieldReader to ensure consistent output type so that Unpivot doesn't get confused. All stats columns should be Nullable, so that stats functions can return NULL when N/A.
* custom versions of "count" that always return BigInt
* HyperLogLog-based NDV that returns BigInt and works only on VarChars
* HyperLogLog with binary output that only works on VarChars
OPS: Updated protobufs for new ops
OPS: Implemented StatisticsMerge
OPS: Implemented StatisticsUnpivot
ANALYZE: AnalyzeTable functionality
* JavaCC syntax more-or-less copied from LucidDB.
* (Basic) AnalyzePrule: DrillAnalyzeRel -> UnpivotPrel StatsMergePrel FilterPrel(for sampling) StatsAggPrel ScanPrel
ANALYZE: Add getMetadataTable() to AbstractSchema
USAGE: Change field access in QueryWrapper
USAGE: Add getDrillTable() to DrillScanRelBase and ScanPrel
* since ScanPrel does not inherit from DrillScanRelBase, this requires adding a DrillTable to the constructor
* This is done so that a custom ReflectiveRelMetadataProvider can access the DrillTable associated with Logical/Physical scans.
USAGE: Attach DrillStatsTable to DrillTable.
* DrillStatsTable represents the data scanned from a corresponding ".stats.drill" table
* In order to avoid doing query execution right after the ".stats.drill" table is found, metadata is not actually collected until the MaterializationVisitor is used.
** Currently, the metadata source must be a string (so that a SQL query can be created). Doing this with a table is probably more complicated.
** Query is set up to extract only the most recent statistics results for each column.
closes #729
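The note above says the statistics query extracts only the most recent results for each column. A minimal in-memory sketch of that selection follows. The class `LatestStatsSketch`, the `StatRow` shape, and its field names are hypothetical illustrations, not the layout of the ".stats.drill" table or any Drill API:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class LatestStatsSketch {

    // Hypothetical row shape: one statistics record for one column,
    // stamped with the time at which it was computed.
    static final class StatRow {
        final String column;
        final long computedAt;
        final long rowCount;

        StatRow(String column, long computedAt, long rowCount) {
            this.column = column;
            this.computedAt = computedAt;
            this.rowCount = rowCount;
        }
    }

    // Keeps, for every column, only the row with the largest computedAt value.
    static Map<String, StatRow> latestPerColumn(List<StatRow> rows) {
        Map<String, StatRow> latest = new HashMap<>();
        for (StatRow row : rows) {
            // merge() passes (existing value, new value) to the function.
            latest.merge(row.column, row,
                (old, cur) -> cur.computedAt > old.computedAt ? cur : old);
        }
        return latest;
    }
}
```

In SQL the same effect would typically come from grouping by column and keeping the row with the maximum timestamp; the loop above only illustrates the selection rule.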
|
|
|
|
closes #1642
- Add output column names to JdbcRecordReader and use them for storing the results since column names in result set may differ when aliases aren't specified
|
|
closes #1530
|
|
close apache/drill#1629
|
|
1. HiveTestBase data initialization moved to static block
to be initialized once for all derivatives.
2. Extracted Hive driver and storage plugin management from HiveTestDataGenerator
to the HiveTestFixture class. This increases the cohesion of the generator and
loosens the coupling between Hive test configuration and data generation
tasks.
3. Replaced usage of Guava ImmutableLists with TestBaseViewSupport
helper methods by using standard JDK collections.
closes #1613
|
|
binary table
1. Added persistence of MAP key and value types in Drill views (affects the .view.drill file) to avoid cast problems in the future.
2. Preserved backward compatibility of older view files by treating untyped maps as ANY.
closes #1602
|
|
plugin when native reader is enabled
closes #1610
|
|
closes #1604
|
|
- Remove plugins usage for instantiating test databases and tables
- Replace derby with h2 database
closes #1603
|
|
If none of the project / filter columns exist in the vector, ensureAtLeastOneField (or the Scan operator) adds at least one field as nullable integer (or nullable varchar if `allTextmode` is enabled).
The downstream Filter operator would then go on to fail with `NumberFormatException` because it tries to convert empty fields to integers.
Since ensureAtLeastOneField is called after reading all the messages in a batch, it can be skipped if the batch is empty.
closes #1595
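The guard described above can be sketched as follows. This is an illustration under stated assumptions, not the actual reader code: `EmptyBatchGuardSketch`, `finishBatch`, and the placeholder column name are hypothetical, and a plain list of column names stands in for the output vector container:

```java
import java.util.ArrayList;
import java.util.List;

final class EmptyBatchGuardSketch {

    // Hypothetical name for the placeholder column added to an otherwise
    // column-less batch.
    static final String PLACEHOLDER = "placeholder_nullable_int";

    // Called once after all messages of a batch have been read.
    // messagesRead: number of messages in the batch.
    // columnsFound: projected / filtered columns that exist in the batch.
    static List<String> finishBatch(int messagesRead, List<String> columnsFound) {
        List<String> output = new ArrayList<>(columnsFound);
        // Add the placeholder only for a non-empty batch with no matching
        // columns; an empty batch is left untouched so that a downstream
        // filter never sees a placeholder field it cannot convert.
        if (messagesRead > 0 && output.isEmpty()) {
            output.add(PLACEHOLDER);
        }
        return output;
    }
}
```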
|
|
DrillConnectionImpl
closes #1596
|
|
1. Added DrillHiveViewTable which allows construction of DrillViewTable based
on Hive metadata
2. Added initialization of DrillHiveViewTable in HiveSchemaFactory
3. Extracted conversion of Hive data types from DrillHiveTable
to HiveToRelDataTypeConverter
4. Removed throwing of UnsupportedOperationException from HiveStoragePlugin
5. Added TestHiveViewsSupport and authorization tests
6. Added closeSilently() method to AutoCloseables
closes #1559
|
|
closes #1575
|
|
closes #1586
|
|
|
|
|
|
- use ${maven.multiModuleProjectDirectory}/header to find header file from any submodule
- suppress UnresolvedMavenProperty, since IDE expects that property should be set explicitly
- update "kr.motd.maven:os-maven-plugin" github.com/trustin/os-maven-plugin to the latest 1.6.1 version
- correction of ${user.name} property for "maven-jar-plugin" <Built-By>
- update "apache-rat-plugin" to solve undefined "excludeSubprojects" in IDE
- regenerate Java and C++ protobuf files
closes #1585
|
|
|
|
non-Linux systems
closes #1580
|
|
- downgrade maven-javadoc-plugin version
- update some Drill maven plugins versions and move them to pluginManagement block
- bump the lowest Maven version supported by Drill to match the org.apache.maven dependencies
closes #1574
|
|
account for DrillSemiJoin
closes #1568
|
|
connection
- Added session-scoped option `drill.exec.fetch_resultset_for_ddl` to control whether an update count or a result set is returned for a JDBC connection session. By default the option is set to `true`, which ensures that a result set is returned.
- Updated Drill JDBC: `DrillCursor` and `DrillStatement` to achieve desired behaviour.
closes #1549
|
|
closes #1554
|
|
plugin
closes #1542
|
|
DrillFilterRel
- Fix workspace case insensitivity for JDBC storage plugin
|
|
- Fix RDBMS integration tests (expected decimal output and testCrossSourceMultiFragmentJoin)
- Update library versions
- Resolve NPE for empty result
|
|
closes #1550
|
|
sun/misc/VM
closes #1446
|
|
secondary index.
|
|
closes #1532
|
|
1. Added the enableStringsSignedMinMax parquet format plugin config and the store.parquet.reader.strings_signed_min_max session option to control reading binary statistics for files generated by versions of Parquet prior to 1.10.0.
2. Added ParquetReaderConfig to store the configuration needed when reading parquet statistics or files.
3. Provided a mechanism to enable varchar / decimal filter push down.
4. Added VersionUtil to compare Drill versions in string representation.
5. Added appropriate unit tests.
closes #1537
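Item 4 mentions comparing Drill versions held as strings. A minimal sketch of such a comparison follows, assuming dot-separated numeric components and an optional suffix such as `-SNAPSHOT` that is ignored. The class `VersionCompareSketch` and its methods are hypothetical and do not reproduce the actual VersionUtil API:

```java
final class VersionCompareSketch {

    // Compares two version strings component by component as numbers.
    // Returns a negative value, zero, or a positive value when left is
    // lower than, equal to, or higher than right.
    // Assumption: missing components count as 0, so "1.15" equals "1.15.0".
    static int compare(String left, String right) {
        int[] a = parse(left);
        int[] b = parse(right);
        int n = Math.max(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = i < a.length ? a[i] : 0;
            int y = i < b.length ? b[i] : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    // Assumption: everything after the first '-' (e.g. "-SNAPSHOT") is dropped.
    private static int[] parse(String version) {
        int dash = version.indexOf('-');
        String core = dash >= 0 ? version.substring(0, dash) : version;
        String[] parts = core.split("\\.");
        int[] numbers = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            numbers[i] = Integer.parseInt(parts[i]);
        }
        return numbers;
    }
}
```

Comparing components numerically matters because a plain string comparison would order "1.9.0" after "1.10.0".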
|
|
closes #1388
|
|
closes #1527
|
|
closes #1518
|
|
|
|
DRILL-6381: Add missing joinControl logic for INTERSECT_DISTINCT.
- Modified HashJoin's probe phase to process INTERSECT_DISTINCT.
- NOTE: For the build phase, the functionality will be the same as for SemiJoin when it is added later.
DRILL-6381: Address code review comment for intersect_distinct.
DRILL-6381: Rebase on latest master and fix compilation issues.
DRILL-6381: Generate protobuf files for C++ native client.
DRILL-6381: Use shaded Guava classes. Add more comments and Javadoc.
|
|
javadoc.
|
|
|