...

  • Conduit makes it easy to connect your data to your favorite BI and data science tools, including Power BI. Your Elasticsearch data is approachable and interactive – in a matter of minutes, no matter where it's stored.

  • Data aggregation and JOINs with a familiar SQL query syntax at your fingertips. Native JOIN with other Elasticsearch datasets or hybrid JOIN with other supported connector types.

  • Automatic flattening and schema generation. Cherry-pick flattened data to use only specific columns needed for reporting to speed things up even more.

  • Advanced feature support, including arrays, multi-nested fields with several depth layers and multiple nested fields defined on the same level.

  • Access your data in real time. Conduit lets you connect in DirectQuery mode instead of Power BI’s standard import mode, which limits the number of data refreshes per day.

  • Advanced Parquet Store cache for fast performance. Configurable expiration and re-caching.

  • Built-in data governance and security controls. Flexible yet robust.
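
To illustrate what flattening and column cherry-picking mean in the list above, here is a minimal Python sketch. It is not Conduit's actual algorithm: the `flatten` helper, the dot-separated column names, and the sample document are all assumptions made for illustration only.

```python
def flatten(document, parent_key="", separator="."):
    """Turn a nested document into a flat {column_name: value} mapping.

    Hypothetical helper: the dotted naming scheme is an assumption and
    may differ from the column names Conduit generates.
    """
    columns = {}
    for key, value in document.items():
        column = f"{parent_key}{separator}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the path so far.
            columns.update(flatten(value, column, separator))
        else:
            # Scalars (and arrays, in this simplified sketch) stay as-is.
            columns[column] = value
    return columns


# Made-up sample document with two levels of nesting.
doc = {"order_id": 17, "customer": {"name": "Ada", "address": {"city": "Oslo"}}}
row = flatten(doc)

# Cherry-pick only the columns needed for a report.
wanted = ["order_id", "customer.address.city"]
report_row = {name: row[name] for name in wanted}
print(report_row)
```

Selecting only the needed columns keeps each result row small, which is the reason the feature list gives for cherry-picking.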

...

On this page:

...

Features

...

Prerequisites

Create Elasticsearch Connector

...

Datasource

...

Authentication

...

Publish

...

Virtualization

...

Authorization

...

Advanced

...


...

Prerequisites

If you haven’t already done so, be sure to sign up for a Conduit account. Try the power and flexibility of Conduit firsthand with a free trial.

...

To cancel connector creation, click the Close button.

...

Authentication

Define how Conduit authorizes external BI users to access specific data and how Conduit connects to the datasource.

...

To cancel connector creation, click the Close button.

...

Publish

Select which data will be available to BI users. You can publish one or more tables, either as entire tables or as specific columns only.

...

To cancel connector creation, click the Close button.

...

Virtualization

On the Virtualization tab, you can configure the following:

...

Authentication type and Authorization configuration can be changed at any time. If permissions are revoked, the data will no longer be accessible to the external user(s), and connectors to a restricted table will no longer appear in the connector list in BI tools.

...

Advanced

Fine-tune how your selections should be published.

...

  • Alias

    • A user-friendly table name that external users see for the published table

    • Optional; if not specified, the real table name is used for identification

  • Cache now

    • Displayed when Connector Cache is enabled on the Virtualization tab; disabled by default

    • Conduit starts caching the data source when the connector is saved, so the first query does not have to wait for the cache to build

  • Auto refresh

    • Displayed when Connector Cache is enabled on the Virtualization tab; enabled by default

    • Conduit re-caches the connector in the Parquet Store when the existing data cache expires

  • Caching Expiration

    • Displayed when Cache Query or Connector Cache is enabled on the Virtualization tab

    • The default cache expiration time is 24 hours; it can be customized for each connector’s dataset as needed

    • Connectors to large datasets benefit from less frequent caching

    • The cache is re-created either as soon as the previous cache expires (if the Auto refresh option is enabled) or the next time a non-native or join query is run after expiration (if the Auto refresh option is disabled)

  • Array Discovery settings

    • Force Array Scan

      • Each newly selected table (index) has “Force Array Scan” performed so that Conduit detects arrays whether or not they are declared as the “nested” type. Keep in mind that “Force Array Scan” is a resource-demanding operation because it scans the actual index documents.

      • On subsequent table modifications, “Force Array Scan” is unchecked by default and can be enabled if your data requires it.

    • Array Discovery Sample Size

      • Default array discovery sample size is 100 documents.

      • Depending on how likely empty arrays are in certain fields of your source dataset, you may need to adjust this value to ensure that all array fields and values are discovered.

      • Use Full Index instead of a specific sample size if needed.

  • Other Settings

    • Fetch Size

      • The number of results returned per page by a single search request, much like a cursor on a traditional database.

    • Query Timeout

      • A search timeout that bounds how long a search request may run. When the timeout expires, the request is canceled and returns the hits accumulated up to that point.

    • Partition Size

      • This parameter advises the connector what the maximum number of documents per input partition should be. The connector samples and estimates the number of documents on each shard to be read, then divides each shard into input slices using the value supplied by this property.

      • This property is ignored if you are reading from an Elasticsearch cluster that does not support scroll slicing (any Elasticsearch version below v5.0.0). By default, this value is unset, and the input partitions are calculated based on the number of shards in the indices being read.
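
The arithmetic behind Partition Size can be sketched roughly as follows. This is an illustration under stated assumptions, not the connector's implementation: the `plan_input_slices` function, the shard names, and the document counts are hypothetical, and rounding up to whole slices is assumed.

```python
import math


def plan_input_slices(docs_per_shard, max_docs_per_partition=None):
    """Estimate how many input slices each shard is divided into.

    docs_per_shard: {shard_name: estimated_document_count} (made-up input).
    max_docs_per_partition: the Partition Size value; None means unset.
    """
    if max_docs_per_partition is not None and max_docs_per_partition <= 0:
        raise ValueError("max_docs_per_partition must be a positive number")

    plan = {}
    for shard, doc_count in docs_per_shard.items():
        if max_docs_per_partition is None:
            # Unset: one input partition per shard, as described above.
            plan[shard] = 1
        else:
            # Assumed rounding: enough slices that none exceeds the limit,
            # and at least one slice even for an empty shard.
            plan[shard] = max(1, math.ceil(doc_count / max_docs_per_partition))
    return plan


estimated = {"shard-0": 250_000, "shard-1": 90_000, "shard-2": 0}
print(plan_input_slices(estimated, max_docs_per_partition=100_000))
# prints {'shard-0': 3, 'shard-1': 1, 'shard-2': 1}
print(plan_input_slices(estimated))
# prints {'shard-0': 1, 'shard-1': 1, 'shard-2': 1}
```

Under these assumptions, a smaller Partition Size yields more, smaller slices per shard, while leaving it unset keeps one partition per shard.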

...

Endpoints

This page contains the endpoints for the newly created connector that you can use to access the data from different applications:

...