Today Apache ORC became a top-level project at the Apache Software
Foundation. This is a major milestone for the project and reflects the
momentum it has built.
Back in January 2013, we created the ORC file format as part of an
initiative to massively speed up Apache Hive and improve the storage
efficiency of data stored in Apache Hadoop. We added it as a feature
of Hive for two reasons:
- To ensure that it would be well integrated with Hive
- To ensure that storing data in ORC format would be as simple as
adding “stored as ORC” to your table definition, as in the example
below.
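For example, defining an ORC-backed table in Hive takes a single
clause in the DDL (the table and column names below are purely
illustrative):

    -- Hive DDL: the STORED AS ORC clause is all that is needed
    CREATE TABLE orders (
      order_id BIGINT,
      customer STRING,
      total    DECIMAL(10,2)
    ) STORED AS ORC;

From that point on, Hive reads and writes the table's data in ORC
format transparently.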
Over the last two years, many of the features that we’ve added to
Hive, such as vectorization, ACID, predicate push down, and LLAP, have
supported ORC first and been extended to other storage formats later.
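Hive’s ACID transactions are a concrete example: they currently work
only on bucketed tables stored as ORC. A minimal sketch (the table
name, columns, and bucket count are illustrative; the properties
assume Hive 0.14 or later):

    -- ACID tables in Hive currently require bucketing and ORC storage
    CREATE TABLE events (
      id      BIGINT,
      payload STRING
    )
    CLUSTERED BY (id) INTO 4 BUCKETS
    STORED AS ORC
    TBLPROPERTIES ('transactional'='true');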
The growing use and acceptance of ORC has encouraged additional Hadoop
execution engines, such as Apache Pig, MapReduce, Cascading, and
Apache Spark, to support reading and writing ORC. However, there are
concerns that depending on the large Hive jar that contains ORC pulls
in many of the other projects that Hive itself depends on. To better
support these non-Hive users, we decided to split ORC off from Hive
into a separate project. This will allow us not only to continue
supporting Hive, but also to provide a much more streamlined jar,
along with documentation and help for users outside of Hive.
Although Hadoop and its ecosystem are largely written in Java, there
are many applications in other languages that would like to access
ORC files in HDFS natively. Hortonworks, HP, and Microsoft are
developing a pure C++ ORC reader and writer that enables C++
applications to read and write ORC files efficiently without
Java. That code will also be moved into Apache ORC and released
together with the Java implementation.