...

  • Conduit makes it easy to connect your data to your favorite BI and data science tools, including Power BI. Make your data approachable and interactive in a matter of minutes, no matter where it's stored.

  • Data aggregation and JOINs 

  • Access your data in real time. Conduit lets you connect in DirectQuery mode instead of Power BI’s standard import mode, which limits the number of data refreshes per day.

  • Advanced Parquet Store cache for fast performance, with configurable expiration and re-caching.

  • Pick only the specific columns needed for reporting to speed things up even more.

  • Built-in data governance and security controls. Flexible yet robust.

...

On this page:

...

Features

...

Prerequisites

Create Connector

...

Datasources

...

Authentication

...

Publish

...

Virtualization

...

Authorization

...

Advanced

...

Prerequisites

If you haven’t already done so, sign up for a Conduit account and try the power and flexibility of Conduit firsthand with a free trial.

...

To cancel connector creation, click the Close button.

...

Authentication

Define how Conduit should authorize external BI users to access specific data, and how Conduit connects to the datasource.

...

To cancel connector creation, click the Close button.

...

Virtualization

On the Virtualization tab, you can configure the following:

  • Enable Query Caching

    • When enabled, Conduit stores the results of every query against the connector's datasets, so that when the exact same query is issued again, the results are returned from memory

    • Result sets that exceed one page of retrieved records (10,000 for Power BI) are not cached, to avoid out-of-memory (OOM) errors

    • Recommended when expensive queries are expected and/or when the underlying data is not expected to change often

    • Cache expiration is 24 hours by default and can be customized for each of the connector's datasets as needed

  • Enable Connector Caching

    • When enabled, Conduit creates a temporary, secure Parquet store of all the connector's datasets for quick future access

    • Recommended for large datasets and/or when expensive queries are expected

    • The tables selected for the connector are cached in the Parquet store, and all queries for this connector are run against the Parquet store

    • Cache expiration is 24 hours by default and can be customized for each of the connector's datasets as needed

    • When connector data is cached, query results are also cached in memory for small and medium result sets to further enhance performance. The query cache expires together with the data cache

    • The Conduit SQL engine is used to run all queries

    • The list of stored Parquet files and their expected expiration times is available on the Performance > Parquet Store page

  • Enable Conduit SQL engine for hybrid join queries 

    • Enabled by default

    • When the checkbox is not checked, the reporting tool shows a message to the analyst and does not run any hybrid joins (joins across tables from a different data source type or a different PostgreSQL server instance). Running hybrid joins requires the Conduit SQL engine to be enabled.

    • The Conduit SQL engine is used to run all queries when Connector Caching is checked
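The Query Caching rules above (exact-match lookup, a one-page size limit, and a 24-hour default expiration) can be illustrated with a minimal sketch. This is not Conduit's implementation: the class `QueryCache`, the function `run_query`, and the injectable `clock` are hypothetical names introduced only for illustration, and the sketch assumes the cache is keyed by the exact query text.

```python
import time

PAGE_SIZE = 10_000                  # one page of records for Power BI, per the text above
DEFAULT_TTL_SECONDS = 24 * 60 * 60  # default cache expiration: 24 hours


class QueryCache:
    """Hypothetical in-memory cache keyed by the exact query text."""

    def __init__(self, ttl_seconds=DEFAULT_TTL_SECONDS, page_size=PAGE_SIZE,
                 clock=time.time):
        self.ttl_seconds = ttl_seconds
        self.page_size = page_size
        self.clock = clock          # injectable so expiration can be simulated
        self._entries = {}          # query text -> (stored_at, rows)

    def get(self, query):
        """Return cached rows, or None if the entry is missing or expired."""
        entry = self._entries.get(query)
        if entry is None:
            return None
        stored_at, rows = entry
        if self.clock() - stored_at >= self.ttl_seconds:
            del self._entries[query]  # expired: drop the stale entry
            return None
        return rows

    def put(self, query, rows):
        """Cache rows unless the result set exceeds one page; report whether stored."""
        if len(rows) > self.page_size:
            return False              # too large: skip caching to limit memory use
        self._entries[query] = (self.clock(), rows)
        return True


def run_query(cache, query, execute):
    """Serve an identical query from memory, otherwise execute and try to cache it."""
    cached = cache.get(query)
    if cached is not None:
        return cached
    rows = execute(query)
    cache.put(query, rows)
    return rows
```

In this sketch, a repeated query only reaches the datasource again after its entry expires, and an oversized result set is always re-executed because it is never stored.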

...

Authorization

Configure access for the selected Authentication type.

...

Authentication type and Authorization configuration can be changed at any time. If permissions are revoked, the data is no longer accessible to the external user(s), and the connector to a restricted table no longer appears in the connector list in BI tools.

...

Advanced

Fine-tune how your selections should be published.

...

  • Alias

    • A user-friendly table name that external users see for a published table

    • Optional; if not specified, the real table name is used for identification

  • Cache now

    • Displayed when Connector Caching is enabled on the Virtualization tab; disabled by default

    • Conduit starts caching the data source when the connector is saved, so the initial query does not have to wait for the cache to be built

  • Auto refresh

    • Displayed when Connector Caching is enabled on the Virtualization tab; enabled by default

    • Conduit re-caches the connector in the Parquet Store when the existing data cache expires

  • Caching Expiration

    • Displayed when Query Caching or Connector Caching is enabled on the Virtualization tab

    • The default cache expiration time is 24 hours; it can be customized for each of the connector’s datasets as needed

    • Connectors to large datasets benefit from less frequent re-caching (a longer expiration time)

    • Once the cache expires, it is re-created either immediately (if Auto refresh is enabled) or the next time a non-native or join query is run (if Auto refresh is disabled)

  • Other settings

    • Partition column

      • You can select a column by which the cached data is physically divided, for better query performance

      • By default, no column is selected

    • Partition Count 

      • You can select how many partitions Spark uses. A partition in Spark is an atomic chunk of data (a logical division of data) stored on a node in the cluster. Partitions are the basic units of parallelism in Apache Spark.

      • By default, the partition count is set to 4
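The Auto refresh and Caching Expiration rules above can be summarized as a small decision function. This is only a sketch of the behavior as described on this page, not Conduit's code: `ConnectorCacheSettings`, `is_expired`, `should_recache`, and the `query_needs_cache` flag (standing in for "a non-native or join query is run") are hypothetical names.

```python
from dataclasses import dataclass

DEFAULT_EXPIRATION_HOURS = 24  # default cache expiration, per the text above


@dataclass
class ConnectorCacheSettings:
    """Hypothetical holder for the Advanced tab options discussed above."""
    auto_refresh: bool = True                       # Auto refresh is enabled by default
    expiration_hours: float = DEFAULT_EXPIRATION_HOURS


def is_expired(settings: ConnectorCacheSettings, cache_age_hours: float) -> bool:
    """A cache is treated as expired once its age reaches the expiration time."""
    return cache_age_hours >= settings.expiration_hours


def should_recache(settings: ConnectorCacheSettings, cache_age_hours: float,
                   query_needs_cache: bool = False) -> bool:
    """Decide whether the cache should be re-created now.

    With Auto refresh on, re-cache as soon as the cache expires.
    With Auto refresh off, re-cache only when an expired cache is needed
    by a non-native or join query (modeled by query_needs_cache).
    """
    if not is_expired(settings, cache_age_hours):
        return False
    if settings.auto_refresh:
        return True
    return query_needs_cache
```

For example, with Auto refresh disabled and a 24-hour expiration, a 30-hour-old cache stays as it is until a join query arrives, at which point `should_recache` returns `True`.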

...

Endpoints

This page contains the endpoints for the newly created connector that you can use to access the data from different applications:

...