2.6.0 Release Notes (Feb 4, 2023)
Installation scripts
On-premises
Step 1) Download installation resources
wget -O conduit-2.6.0.tar.gz 'https://bpartifactstorage.blob.core.windows.net/conduit-artifacts-public/2.6.0/conduit-2.6.0.tar.gz'
Step 2) Download installation manager
wget -O conduit-transformless-install-onprem.sh 'https://bpartifactstorage.blob.core.windows.net/conduit-artifacts-public/2.6.0/conduit-transformless-install-onprem.sh'
Step 3) Follow the instructions from Installing on a VM (any cloud or on-premises server)
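The download steps above can be followed by an integrity check before extraction. Note that the vendor does not publish a checksum, so recording one locally at download time is a suggested safeguard rather than an official step, and the stand-in tarball below exists only so the sketch runs offline; in practice you would operate on the conduit-2.6.0.tar.gz fetched in Step 1:

```shell
# Sketch: verify and unpack the release tarball before running the install
# script. A stand-in tarball is created here so the sketch runs offline;
# replace it with the real conduit-2.6.0.tar.gz from Step 1.
set -e
workdir=$(mktemp -d)
cd "$workdir"

# --- stand-in for the wget download in Step 1 ---
mkdir conduit-2.6.0
echo 'conduit 2.6.0' > conduit-2.6.0/VERSION
tar -czf conduit-2.6.0.tar.gz conduit-2.6.0

# Record a checksum at download time so later copies or re-runs can be
# validated (local safeguard only; no official checksum is published).
sha256sum conduit-2.6.0.tar.gz > conduit-2.6.0.tar.gz.sha256
sha256sum -c conduit-2.6.0.tar.gz.sha256

# Inspect the archive, then extract it next to the install script from Step 2.
tar -tzf conduit-2.6.0.tar.gz
tar -xzf conduit-2.6.0.tar.gz
cat conduit-2.6.0/VERSION
```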
Release summary
We are proud to announce the next release of Conduit with the following major features and improvements:
SFTP Connector
We are excited to announce a new file connector: SFTP.
With this release you can use Conduit to migrate, query, and virtualize datasets that sit behind SFTP servers. The SFTP connector is production-ready.
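A typical preparation step before pointing a connector at an SFTP server is generating a key pair for it. Whether the Conduit SFTP connector accepts key-based authentication is an assumption here (the release notes do not say; password authentication may be required instead), so treat this as a generic sketch:

```shell
# Sketch: prepare key-based credentials for an SFTP connector.
# Key-auth support is an assumption -- check the connector documentation.
set -e
keydir=$(mktemp -d)

# Generate a passphrase-less ed25519 key pair for the connector.
ssh-keygen -t ed25519 -N '' -f "$keydir/conduit_sftp_key" -C 'conduit-sftp-connector'

# The private key path would go into the connector configuration; the public
# key is installed in authorized_keys on the SFTP server.
cat "$keydir/conduit_sftp_key.pub"
```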
Materialization Destinations
Another major feature we are proud to introduce is Materialization Destinations: support for materializing data (i.e. caching data) to a different destination per connector.
Data Store
Parquet store is renamed to Data store.
High Availability
With this release Conduit can run in High Availability mode: multiple Conduit instances behind a load balancer, providing system resilience.
See more information regarding Conduit deployment modes at Conduit Deployment Diagrams.
Conduit can run in High Availability mode in the following environments:
On-premises
Google Cloud
Azure
Please contact us for support regarding installing and running Conduit in High Availability mode.
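The load-balanced setup described above can be sketched with a minimal reverse-proxy configuration. This nginx fragment is an illustration only: the host names, port 8443, and the choice of nginx itself are assumptions, not part of the product documentation; consult the Conduit Deployment Diagrams page and support for the officially supported topologies.

```nginx
# Hypothetical load balancer in front of two Conduit instances in HA mode.
# Host names and ports are placeholders; adapt to your deployment.
upstream conduit_ha {
    server conduit-node-1.internal:8443;
    server conduit-node-2.internal:8443;
}

server {
    listen 80;
    location / {
        proxy_pass http://conduit_ha;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```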
AWS RDS Connectors
To improve the user experience and simplify the use of cloud connectors, hiding the complex provider-specific configuration, each supported database now has a dedicated connector type for its cloud provider.
With this release we introduce connectors for the following AWS RDS databases:
AWS MSSQL
AWS MySQL
AWS Oracle
AWS Postgres
AWS MariaDB
AWS Aurora MySQL
AWS Aurora Postgres
Complete release notes
SFTP Connector
[CONDUITV3-1470] SFTP connector (#2595)
Data store
Data store was previously known as Parquet Store.
[CONDUITV3-XYZ] TriggeredBy in materialization audit is always null
[CONDUITV3-1845] Restart materialization in data store caused loop reset (#2577)
[CONDUITV3-1880] Change endpoint path from "/parquet-cache" to "/parquet-store" (#2597)
[CONDUITV3-1877] Entry in parquet store does not change status to 'Not Started' when virtual dataset is removed, if "materialize now" was enabled
[CONDUITV3-785] Query cache fails to expire when "Query Caching" is unchecked if table names have capital letters
Materialization Destinations
Each Destination also has a configurable option to select which data storage format to use when materializing the dataset.
The supported storage formats are:
parquet
delta table
In most cases a Destination will contain data from a single dataset. A Destination is intended to be used as input by other systems, so it is highly unlikely to store multiple data formats in the same folder. In practice, each Destination is logically independent of the other Destinations with regard to data content.
Destination types:
Azure Blob
Azure Government
AWS S3
Google Cloud
File System
The Materialization Destination is configurable per connector when Connector Materialization is enabled.
This Destination is used by default for all dataset tables selected at the previous step.
In addition, as an advanced configuration, the user can override the Destination setting for individual tables at the Advanced step of the Connector configuration wizard.
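The default-plus-override model described above might be expressed in a connector configuration along the following lines. Every field name in this sketch is hypothetical (the release notes do not document the configuration schema); it is shown only to illustrate one default Destination per connector with a per-table override:

```json
{
  "connector": "sales_mysql",
  "materialization": {
    "enabled": true,
    "destination": {
      "type": "azure_blob",
      "container": "conduit-cache",
      "storage_format": "parquet"
    },
    "table_overrides": {
      "orders_archive": {
        "type": "aws_s3",
        "bucket": "conduit-cold-storage",
        "storage_format": "delta"
      }
    }
  }
}
```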
High Availability
Service Management
[CONDUITV3-1185] - [UI] - Individual VM nodes API changes and cluster view with disabled restart/check status functionality (#2236)
[CONDUITV3-843] - Conduit service High Availability - Clustered mode (#2177)
[CONDUITV3-1177] - Service Management error displayed at first login on a clean server (#2257)
[CONDUITV3-1050] - Implement the monitoring of VMs and services as a cluster from Conduit (#2237)
[CONDUITV3-1524] - [ServiceManagement] Conduit Master - display spark and hdfs namenode status when available (#2393)
[CONDUITV3-1524] [ServiceManagement] Conduit Master - display spark and hdfs namenode status when available (#2386)
[CONDUITV3-1050] Fix api for monitoring a specific service from the cluster (#2281)
[CONDUITV3-1525] - [ServiceManagement] Conduit Worker - display spark worker and hdfs datanode status when available (#2404)
[CONDUITV3-1503] - [UI] display which Conduit VM is the leader in the serviceManagement view
Arrow Flight Server
[CONDUITV3-1097] - Implement running queries through Arrow FlightServer for each connector when Spark queries encountered on Slave nodes (#2262)
Installation
[CONDUITV3-1261] update in agent-service the default value of conduit_fs_type for existing and new production deployment
[CONDUITV3-1496] Create Azure resources then automatically start installing Conduit (#2412)
Data Catalog
[CONDUITV3-1490] - [HA] Data catalog assets are processed by all the nodes in a ha environment (#2402)
Error Handling
[CONDUITV3-1512] - [UI] improve 502 error message when Conduit is in HA mode (#2464)
Miscellaneous
[CONDUITV3-xyz] Rename DB_NAME to CONNECTOR_NAME in Parquet API Swagger
[CONDUITV3-1886] "Connection pool shutdown" error on s3 configuration
[CONDUITV3-1851] fixes "Failed to update query audit for query" error during query execution
[CONDUITV3-1878] Duration of caching reported in logs differs from the one in Data Store status
[CONDUITV3-1264] - Separate Aurora connectors in UI for mysql and postgres libraries (#2607)
[CONDUITV3-1940] - Display cancel dialog for metadata query (explore query) after 5 seconds (#2606)
[CONDUITV3-1914] Support materialization destination per connector (#2600)