
Centralizing File Regex Metadata

In this section, we will learn how to centralize File Regex metadata in Talend Studio for the Data Integration platform.

Before going further, let us first understand why we use Regex files.

Regex files are text files whose lines follow a pattern that can be described by a regular expression. A File Regex schema uses that expression to split each line into fields.

For example: log files.

If we need to connect to a regex file in more than one place, we can centralize its connection and schema information in the Repository so that it can be reused.

To create the Regex File connection from the beginning:

  • Go to the Repository panel.
  • Then expand the Metadata node, right-click on File Regex, and select the Create file regex option in the popup menu, as shown in the image below:

Repository → Metadata → File Regex → Create file regex


Note: The File Metadata setup window can also be opened from a Job: go to the Basic settings view of the relevant component with its Property Type set to Built-In.

The New RegEx File window then opens, where both the file connection and the schema definition are completed in four steps:

  • Defining the general properties
  • Defining the file path and format
  • Defining the file parsing parameters
  • Checking and customizing the file schema

Step 1: Defining General Properties

In the first step, we fill in the necessary details: the Name, which is a mandatory field, and the Purpose and Description fields if we want to be more specific.

The Version and Status fields of a Repository item can also be managed in the Project Settings dialog box.

Click on the Select button next to the Path field to choose a folder under the File Regex node that will hold our newly created file connection.

Note: We cannot select a folder while editing an existing connection, but we can drag and drop the connection to a new folder whenever we want.

After filling all the details of general properties, click on the Next button.


Step 2: Defining File path and Format

In the next step, we click on the Browse button to locate our file on the local system.

For example, we select the customer_regex.txt file from our system.

  • Select the Format that matches the operating system on which our .txt file was created.
  • In this example, we select Windows from the given drop-down list.
  • If a suitable format is not available in the drop-down list, ignore this setting.
  • The File Viewer gives an instant picture of the loaded file, as shown in the snapshot below:
  • After that, click on the Next button to proceed.

Step 3: Defining File Parsing Parameters

In this step, we define the file parsing parameters so that the file schema can be retrieved correctly.

  • We can set the Field and Row separators in the File Settings area.
  • If the Row Separator of our file is not the standard EOL [end of line], we can select Custom String from the Row Separator drop-down list and type the character string in the Corresponding Character field.
  • To delimit the fields of the file, we enter the regular expression in the Regular Expression settings area.
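The Row Separator setting above can be pictured with a small Python sketch. It is only an illustration of the idea, not Talend's own implementation: the sample text, the ";;" separator, and the split_rows helper are all assumptions made for this example.

```python
# Hypothetical file content: rows end with ";;" instead of a standard end of line.
raw = "custname=Asha city=Pune;;custname=Ravi city=Delhi;;"

# Stands in for the Custom String typed into the Corresponding Character field.
ROW_SEPARATOR = ";;"


def split_rows(text, separator):
    """Split text into rows on the separator, dropping empty rows."""
    return [row for row in text.split(separator) if row]


rows = split_rows(raw, ROW_SEPARATOR)
for row in rows:
    print(row)
```

Each printed row is what the regular expression is then applied to, one row at a time.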

Note: A regular expression is used to search for specific patterns of text; we can create a regular expression for any pattern of text.

We can see this in the screenshot below:


Note: The regular expression must be enclosed in single or double quotes.

The regular expression for our text file is: "custname=(.+)city=(.+)"

Here,

. is a special character that matches any single character.

+ matches the preceding element one or more times.

( ) groups part of the expression; the text matched by each group becomes one field of the schema.
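To see what this expression captures, here is a minimal Python sketch. Talend itself evaluates Java regular expressions, but this particular pattern behaves the same way in Python's re module. The sample line and the parse_line helper are assumptions for illustration, since the actual content of customer_regex.txt is not shown in this tutorial.

```python
import re

# The pattern from this tutorial; each parenthesised group becomes one field.
PATTERN = re.compile(r"custname=(.+)city=(.+)")


def parse_line(line):
    """Return (custname, city) for a matching line, or None otherwise."""
    match = PATTERN.match(line)
    if match is None:
        return None
    # .+ is greedy and also matches spaces, so the first group would keep
    # the space before "city=". Stripping it is a choice made in this sketch.
    return tuple(group.strip() for group in match.groups())


# Hypothetical sample lines.
print(parse_line("custname=Asha city=Pune"))   # ('Asha', 'Pune')
print(parse_line("no fields on this line"))    # None
```

A line that does not match the pattern yields no fields, so it is worth checking the preview for rows that come back empty.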

  • To see the effect of the new settings, look at the File Preview panel, and check the Set heading row as column names box to use the first parsed row as the labels of the schema columns.

Click on the Refresh Preview button to see the result in the viewer.
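The idea of using the first parsed row as column labels can be sketched in Python as follows. This is only a conceptual illustration under assumed data: the parsed rows and the apply_heading_row helper are hypothetical and do not come from Talend.

```python
def apply_heading_row(parsed_rows):
    """Use the first parsed row as column labels for the remaining rows."""
    if not parsed_rows:
        return []
    header, *data = parsed_rows
    return [dict(zip(header, row)) for row in data]


# Hypothetical parsed rows: the first row holds labels rather than data.
parsed = [("custname", "city"), ("Asha", "Pune"), ("Ravi", "Delhi")]
records = apply_heading_row(parsed)
print(records)
```

When the box is left unchecked, the first row is treated as ordinary data and the columns keep their default names.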


After that, click on the Next button.

Step 4: Checking and Customizing the File schema

In the last step, we check and customize the file schema:

  • To customize the file schema, check that the data type in the Type column is correct for each column; in the Description of the Schema section, we can also rename the columns to match the names used in the actual file.
  • The Guess button is used to retrieve the Regex File schema again, for example after the source file has changed.
  • After that, click on the Finish button, as shown in the image below:
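Checking the Type column in Step 4 amounts to asking whether every value in a column really fits the proposed type. The Python sketch below shows one simple way to reason about this. It is not how Talend guesses a schema; the guess_type helper, the sample columns, and the type names used as labels are assumptions for illustration.

```python
def guess_type(values):
    """Guess a column type label from sample string values."""
    def all_parse(convert):
        # True only if every value converts without error.
        try:
            for value in values:
                convert(value)
            return True
        except ValueError:
            return False

    if values and all_parse(int):
        return "Integer"
    if values and all_parse(float):
        return "Float"
    return "String"


# Hypothetical columns as they might come out of the parsing step.
columns = {
    "custname": ["Asha", "Ravi"],
    "age": ["34", "41"],
    "balance": ["10.5", "7"],
}
schema = {name: guess_type(values) for name, values in columns.items()}
print(schema)
```

If a single value in a column does not fit a numeric type, the sketch falls back to String, which is the same kind of correction we would make by hand in the Type column.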

To see the newly created metadata in Talend Studio:

  • Go to the Repository panel, then expand the Metadata node.
  • After that, expand the File Regex node, as shown in the screenshot below:

Repository → Metadata → File Regex → customer_regex


To reuse the metadata in a new or an existing component, simply drag the file connection or schema from the Repository's Metadata node and drop it onto the design workspace.

To modify an existing file connection:

  • Go to the Repository panel, then expand the Metadata node.
  • After that, expand File Regex, right-click on the connection, and select Edit file regex, as shown in the image below:

To add a new schema to an existing file connection:

  • Go to the Repository panel, expand File Regex under the Metadata node, and right-click on the file connection.
  • Select Retrieve Schema from the popup menu, as shown in the image below: