Creating Pipeline in Oracle Endeca

So far I have written many posts about querying Endeca records from the Oracle Endeca MDEX Engine. In this post I will share the steps involved in creating an Oracle Endeca pipeline. I will use a very basic product catalog data structure as the input for the pipeline. The steps are more or less the same irrespective of the complexity or size of the data. To create an Oracle Endeca pipeline, you need Oracle Endeca Guided Search installed along with Oracle Endeca Developer Studio. Developer Studio is available only for the Windows operating system. Please take a look at this post if you have not already installed Oracle Endeca Guided Search.

Topics covered in this post.

  1. Prepare the Source Data.
  2. Classify/Categorize the Data. (Understand the Taxonomy)
  3. Oracle Endeca EAC Application Creation.
  4. Oracle Endeca Pipeline Creation.
  5. Oracle Endeca EAC Application Initialization.
  6. Indexing Data into MDEX.
  7. Testing Pipeline and Indexed Data Using JSP Reference Application.

Prepare the Source Data.

In this example we use the very simple product catalog listed below. The data is stored in a pipe-delimited file named data.txt, and all five records in the file are shown. The data.txt file can be found at ENDECA_HOME/Apps/simplecatalog/test_data/baseline. (C:\Endeca\Apps\simplecatalog\test_data\baseline\data.txt)

id | name   | price   | brand  | color
1  | Laptop | 1233.23 | hp     | Black
2  | Laptop | 1999.32 | Lenovo | Black
3  | Laptop | 340.67  | Apple  | White
4  | Laptop | 500.18  | HCL    | Brown
5  | Laptop | 790.90  | Wipro  | White
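
To sanity-check the source file before building the pipeline, the records can be read with a few lines of Python. This is only a sketch: it assumes the file has a header row and uses "|" as the delimiter with no surrounding spaces, and the SAMPLE text and read_records helper are illustrative names, not part of Endeca.

```python
import csv
import io

# Sample content mirroring the five records listed in this post.
# Assumption: the real data.txt has a header row and no padding around "|".
SAMPLE = """id|name|price|brand|color
1|Laptop|1233.23|hp|Black
2|Laptop|1999.32|Lenovo|Black
3|Laptop|340.67|Apple|White
4|Laptop|500.18|HCL|Brown
5|Laptop|790.90|Wipro|White
"""


def read_records(text):
    """Parse pipe-delimited text with a header row into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter="|")
    return [dict(row) for row in reader]


records = read_records(SAMPLE)
for record in records:
    print(record["id"], record["brand"], record["price"])
```

To check the real file instead, pass the contents of data.txt to read_records in place of SAMPLE.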

Classify/Categorize the Data. (Understand the Taxonomy)

In the record set above we have two simple properties and three dimensions. To keep it simple, I haven’t included a product category dimension in the source data.

Simple Properties.

name | data type    | searchable
id   | alphanumeric | yes
name | alphanumeric | yes

Dimensions.

name  | data type      | match mode     | searchable | multiple select
price | floating point | normal - range | yes        | no
color | alphanumeric   | auto gen       | yes        | no
brand | alphanumeric   | auto gen       | yes        | no
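
The classification above can be written down as data, which makes it easy to see which source columns end up as properties and which as dimensions. The snippet below is only a Python illustration of the taxonomy; the names SIMPLE_PROPERTIES, DIMENSIONS and split_record are hypothetical and do not reflect how Developer Studio stores its configuration.

```python
# Column classification taken from the two tables in this post.
SIMPLE_PROPERTIES = ("id", "name")
DIMENSIONS = ("price", "color", "brand")


def split_record(record):
    """Split one source record into (properties, dimensions) dicts."""
    props = {key: record[key] for key in SIMPLE_PROPERTIES}
    dims = {key: record[key] for key in DIMENSIONS}
    return props, dims


props, dims = split_record(
    {"id": "3", "name": "Laptop", "price": "340.67", "brand": "Apple", "color": "White"}
)
print(props)
print(dims)
```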

Oracle Endeca EAC Application Creation.

The next step in creating a pipeline is to create an Oracle Endeca EAC application. To create an EAC application, you can use the deployment template that Oracle Endeca ships with ToolsAndFrameworks. Open a command prompt in admin mode (check this post if you are not sure how to open a command prompt in admin mode in Windows), move to ENDECA_HOME/ToolsAndFrameworks/version/deployment_template/bin and run deploy.bat (on my machine it was C:\Endeca\ToolsAndFrameworks\3.1.2\deployment_template\bin\deploy.bat). When you execute the batch program, it prompts you for the application name, the ports of EAC and the MDEX Dgraphs, and so on. Please give the appropriate port numbers for your local installation. In this example we use “simplecatalog” as the application name. See the EAC application creation log in the appendix section of this post.

Oracle Endeca Pipeline Creation.

Now it is time to create the pipeline according to the taxonomy we identified in the second step. Start Oracle Endeca Developer Studio and open the pipeline file simplecatalog.esp from ENDECA_HOME/Apps/simplecatalog/config/pipeline. (On my machine the .esp file was located at C:\Endeca\Apps\simplecatalog\config\pipeline\simplecatalog.esp) I have captured the whole process of creating the pipeline for the above data and taxonomy in a video. Please check the video given below.

Important steps in creating the pipeline.

  • Open the .esp file from the EAC simplecatalog application folder.
  • Add two simple properties (id, name).
  • Add three dimensions (brand, color, price).
  • Configure the mapping from source records to Endeca records.
  • Add dimension values for the price range dimension.
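
To make the last step more concrete, here is a small Python sketch of what a range dimension does conceptually: each record's price is assigned to the dimension value whose bounds contain it. The three ranges and their labels are hypothetical and chosen only for illustration; the actual dimension values are the ones you define in Developer Studio.

```python
# Hypothetical range dimension values:
# (label, lower bound inclusive, upper bound exclusive)
PRICE_RANGES = [
    ("Under 500", 0.0, 500.0),
    ("500 to 1000", 500.0, 1000.0),
    ("1000 and above", 1000.0, float("inf")),
]


def price_range_label(price):
    """Return the label of the range that contains price, or None if no range matches."""
    for label, low, high in PRICE_RANGES:
        if low <= price < high:
            return label
    return None


for price in (1233.23, 1999.32, 340.67, 500.18, 790.90):
    print(price, price_range_label(price))
```

Note the choice of an inclusive lower bound and an exclusive upper bound, so that a price sitting exactly on a boundary falls into exactly one range.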

Oracle Endeca EAC Application Initialization.

To initialize the simplecatalog application, open a command prompt in admin mode, move to the ENDECA_HOME/Apps/simplecatalog/control folder and run initialize_services.bat. (On my machine initialize_services.bat was in C:\Endeca\Apps\simplecatalog\control.) The EAC application initialization log can be found in the appendix section.

Indexing Data into MDEX.

Moving data into MDEX for a baseline index is a two-step process.

  • Run load_baseline_test_data.bat (C:\Endeca\Apps\simplecatalog\control\load_baseline_test_data.bat)
  • Run baseline_update.bat (C:\Endeca\Apps\simplecatalog\control\baseline_update.bat)
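
If you run these two scripts often, they can be chained so that the second one only starts when the first one succeeds. The following is a minimal Python sketch, assuming the control folder path from my machine and assuming the batch files report failure through a non-zero exit code; run_baseline is a hypothetical helper, not something shipped with Endeca.

```python
import subprocess

# Path from this post; adjust to your own installation.
CONTROL_DIR = r"C:\Endeca\Apps\simplecatalog\control"
STEPS = ["load_baseline_test_data.bat", "baseline_update.bat"]


def run_baseline(control_dir=CONTROL_DIR, steps=STEPS, runner=subprocess.run):
    """Run each script in order and stop at the first non-zero exit code."""
    completed = []
    for script in steps:
        # "cmd /c" runs the batch file through the Windows command interpreter.
        result = runner(["cmd", "/c", script], cwd=control_dir)
        if result.returncode != 0:
            raise RuntimeError(f"{script} failed with exit code {result.returncode}")
        completed.append(script)
    return completed
```

Calling run_baseline() from an admin command prompt on the Endeca machine runs both steps. The runner parameter exists only so the function can be tried out without a real installation.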

Console logs for both batch programs can be found in the appendix section.

Testing Pipeline and Indexed Data Using JSP Reference Application.

You can test the newly created application using the JSP reference application. Check the video given below to see the details.

Appendix.

Oracle Endeca EAC Application Creation Log.

Oracle Endeca EAC Application Initialization Log.

Load Baseline Test Data Log.

Update Baseline Test Data Log.

I hope this post helped you understand the process involved in creating an Oracle Endeca pipeline. I will write another post soon about a JDBC pipeline.

8 thoughts on “Creating Pipeline in Oracle Endeca”

  1. Hi Sanju, I wonder if you could answer a question for me. My company is considering using Endeca software as a search engine, but we have some questions which cannot be easily answered using only the Oracle docs (I must mention I’m an Endeca newbie, so using only the Oracle docs is still a little hard for me). Let me give you a simple example:
    Let’s assume we have an application which allows users to search for houses/flats to rent.
    I want to rent a house and I have 3 “objects” that serve as the data source for an Endeca query:
    1. Country
    – id (Simple Property)
    – name (Simple Property)
    – continent: List (Dimension)
    – temperature range: List (Dimension)
    2. Equipment
    – id (Simple Property)
    – name (Simple Property)
    – pool: Boolean (Dimension)
    – number of bedrooms List (Dimension)
    – size of garden List (Dimension)
    3. Time
    – start date: Date
    – end date: Date
    As a customer I want to search for a particular house and fill in all of the attributes. Here is my actual question: how is data stored in Endeca? Does it keep all “permutations” of possible answers as a plain data source? Or does it build the query ad hoc and work like a “join” in SQL? I don’t know if my question is stupid :) But as I mentioned, I’m an Endeca newbie, and I wonder how it meets efficiency requirements and how the data is stored. I would be very grateful if you could answer my question.

  2. Yes, data is de-normalized and stored in key-value form. This is at a very high level. MDEX is a black box and I don’t know the internal representation of data. If you need a detailed explanation, please take a look at page 20 of the Basic Development Guide. (http://docs.oracle.com/cd/E28910_01/MDEX.622/pdf/BasicDevGuide.pdf) At a very high level, if query speed is your priority, you would use a separate record for every combination. Reducing the number of records would bring down the indexing time. In that case you may maintain lists as multi-value attributes instead of separate records.

  3. I started following the same steps, but when I run initialize_services.bat I get the error below:

    C:\Endeca\Apps\simplecatalog\control>initialize_services.bat
    [07.12.14 13:07:57] SEVERE: Caught exception while querying for defined application list.
    Occurred while executing line 18 of valid BeanShell script:
    [[

    15|
    16| // If the application is already defined
    17| // log an error and exit with code 1
    18| if (app.isDefined()) {
    19| log.severe("An application already exists with the name, \"" + provObj.getAppName() + "\". " +
    20| "Please use the '--force' option if you want to replace all existing configuration.");
    21| System.exit(1);

    ]]

    [07.12.14 13:07:57] SEVERE: Caught an exception while invoking method ‘run’ on object ‘AssertNotDefined’. Releasing locks.

    Caused by java.lang.reflect.InvocationTargetException
    sun.reflect.NativeMethodAccessorImpl invoke0 – null
    Caused by com.endeca.soleng.eac.toolkit.exception.AppControlException
    com.endeca.soleng.eac.toolkit.script.Script runBeanShellScript – Error executing valid BeanShell script.
    Caused by com.endeca.soleng.eac.toolkit.exception.EacCommunicationException
    com.endeca.soleng.eac.toolkit.application.Application isDefined – Caught exception while querying for defined application list.
    Caused by org.apache.axis.AxisFault
    org.apache.axis.AxisFault makeFault – ; nested exception is:
    java.net.ConnectException: Connection refused: connect
    Caused by java.net.ConnectException
    java.net.PlainSocketImpl socketConnect – Connection refused: connect

    Can you please help me with this?

  4. Hi,
    Thanks for the detailed steps.
    I’m getting this warning when I run the baseline update for my sample application.

    WARN 04/01/15 12:17:54.441 UTC (1427890674441) FORGE {baseline}: Property ‘color
    ‘ is not a valid NCName.
    WARN 04/01/15 12:17:54.581 UTC (1427890674579) FORGE {baseline}: Forge completed with 0 errors and 1 warning.

    Could you please let me know what the issue is here? I followed the same steps as mentioned above. The color dimension is not being picked up and is not displayed in the JSP page. I look forward to your help.

    Regards,
    Laxmana
