So far I have written many posts about querying Endeca records from the Oracle Endeca MDEX Engine. In this post I will share the steps involved in creating an Oracle Endeca pipeline. I will use a very basic product catalog as the input for the pipeline; the steps are more or less the same irrespective of the complexity or size of the data. To create an Oracle Endeca pipeline you need Oracle Endeca Guided Search installed along with Oracle Endeca Developer Studio. Developer Studio is available only for the Windows operating system. Please take a look at this post if you have not already installed Oracle Endeca Guided Search.
Topics covered in this post.
- Prepare the Source Data.
- Classify/Categorize the Data. (Understand the Taxonomy)
- Oracle Endeca EAC Application Creation.
- Oracle Endeca Pipeline Creation.
- Oracle Endeca EAC Application Initialization.
- Indexing Data into MDEX.
- Testing Pipeline and Indexed Data Using jsp Reference Application.
Prepare the Source Data.
In this example we use the very simple product catalog listed below. The data is stored in a pipe-delimited file named data.txt, and all five records in the file are shown. The file can be found at ENDECA_HOME/Apps/simplecatalog/test_data/baseline (on my machine, C:\Endeca\Apps\simplecatalog\test_data\baseline\data.txt).
id | name | price | brand | color |
---|---|---|---|---|
1 | Laptop | 1233.23 | hp | Black |
2 | Laptop | 1999.32 | Lenovo | Black |
3 | Laptop | 340.67 | Apple | White |
4 | Laptop | 500.18 | HCL | Brown |
5 | Laptop | 790.90 | Wipro | White |
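To make the input format concrete, here is a minimal Python sketch that parses the same five records. It assumes the file has a header row and a bare `|` between fields with no trailing delimiter; the real data.txt layout may differ. `parse_catalog` and `SAMPLE_DATA` are hypothetical names used only for illustration and are not part of Endeca.

```python
import csv
import io

# Sample content mirroring the five records above.
# Assumption: header row, '|' delimiter, no trailing delimiter.
SAMPLE_DATA = """id|name|price|brand|color
1|Laptop|1233.23|hp|Black
2|Laptop|1999.32|Lenovo|Black
3|Laptop|340.67|Apple|White
4|Laptop|500.18|HCL|Brown
5|Laptop|790.90|Wipro|White
"""


def parse_catalog(text):
    """Parse pipe-delimited catalog text into a list of dicts keyed by column name."""
    reader = csv.DictReader(io.StringIO(text), delimiter="|")
    return [dict(row) for row in reader]


records = parse_catalog(SAMPLE_DATA)
for record in records:
    print(record["id"], record["brand"], record["price"])
```

All values come back as strings; converting price to a number is left out on purpose, since in the pipeline that typing is decided by the property and dimension configuration rather than by the file itself.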
Classify/Categorize the Data. (Understand the Taxonomy)
The record set above has two simple properties and three dimensions. To keep it simple, I haven’t included a product category dimension in the source data.
Simple Properties.
name | data type | searchable |
---|---|---|
id | alphanumeric | yes |
name | alphanumeric | yes |
Dimensions.
name | data type | match mode | searchable | multiple select |
---|---|---|---|---|
price | floating point | normal – range | yes | no |
color | alphanumeric | auto gen | yes | no |
brand | alphanumeric | auto gen | yes | no |
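The classification above can also be written down as a small data structure, which is handy as a checklist when configuring Developer Studio later. This is only a sketch of the taxonomy from the two tables; the `Field` class and its attribute names are hypothetical and do not correspond to any Endeca API, and the match mode of price is read here as "range".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    name: str
    data_type: str
    kind: str             # "property" or "dimension"
    match_mode: str = ""  # only meaningful for dimensions


# The five source columns, classified as in the tables above.
TAXONOMY = [
    Field("id", "alphanumeric", "property"),
    Field("name", "alphanumeric", "property"),
    Field("price", "floating point", "dimension", "range"),
    Field("color", "alphanumeric", "dimension", "auto gen"),
    Field("brand", "alphanumeric", "dimension", "auto gen"),
]

properties = [f.name for f in TAXONOMY if f.kind == "property"]
dimensions = [f.name for f in TAXONOMY if f.kind == "dimension"]
print("properties:", properties)
print("dimensions:", dimensions)
```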
Oracle Endeca EAC Application Creation.
The next step is to create an Oracle Endeca EAC application. You can use the deployment template that ships with Oracle Endeca ToolsAndFrameworks. Open a command prompt in admin mode (check this post if you are not sure how to open a command prompt in admin mode on Windows), move to ENDECA_HOME/ToolsAndFrameworks/version/deployment_template/bin (on my machine, C:\Endeca\ToolsAndFrameworks\3.1.2\deployment_template\bin) and run deploy.bat. The batch program prompts you for the application name, the deployment directory, and the ports of the EAC, Workbench, the Dgraphs and the LogServer. Give port numbers appropriate to your local installation. In this example we use “simplecatalog” as the application name. See the EAC application creation log in the appendix section of this post.
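The prompts shown in the creation log in the appendix state a few rules: the application name must match `^[a-zA-Z0-9]+$`, the deployment directory cannot contain spaces, and ports must be in the range 1024-65535. If you want to check your values before running deploy.bat, a small Python sketch like the following could help. The helper names are hypothetical and are not part of the deployment template; the script itself remains the authority on what it accepts.

```python
import re

# Rule quoted by the deploy.bat prompt.
APP_NAME_PATTERN = re.compile(r"^[a-zA-Z0-9]+$")


def is_valid_app_name(name):
    """True if the name contains only letters and digits."""
    return APP_NAME_PATTERN.match(name) is not None


def is_valid_port(port):
    """True if the port is within the range stated by the prompt."""
    return 1024 <= port <= 65535


def check_deploy_inputs(app_name, deploy_dir, ports):
    """Return a list of problems; an empty list means the inputs look acceptable."""
    problems = []
    if not is_valid_app_name(app_name):
        problems.append(f"invalid application name: {app_name!r}")
    if " " in deploy_dir:
        problems.append(f"deployment directory contains spaces: {deploy_dir!r}")
    for label, port in ports.items():
        if not is_valid_port(port):
            problems.append(f"{label} port out of range: {port}")
    return problems


# Values used in this post.
problems = check_deploy_inputs(
    "simplecatalog",
    r"C:\Endeca\Apps",
    {"EAC": 8888, "Workbench": 8006, "Live Dgraph": 17000,
     "Authoring Dgraph": 17002, "LogServer": 17010},
)
print(problems or "inputs look fine")
```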
Oracle Endeca Pipeline Creation.
Now it is time to create the pipeline for the taxonomy we identified in the second step. Start Oracle Endeca Developer Studio and open the pipeline file simplecatalog.esp from ENDECA_HOME/Apps/simplecatalog/config/pipeline. (On my machine the .esp file was located at C:\Endeca\Apps\simplecatalog\config\pipeline\simplecatalog.esp.) I have captured the whole process of creating the pipeline for the above data and taxonomy in a video. Please check the video given below.
Important steps in creating the pipeline.
- Open the .esp file from the EAC simplecatalog application folder.
- Add two simple properties (id, name).
- Add three dimensions (brand, color, price).
- Configure the source record – Endeca record mapping.
- Add dimension values for price range dimension.
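The last step is the least obvious one: a range dimension such as price needs explicit dimension values, each covering a span of prices. The Python sketch below illustrates the idea only. The three ranges are hypothetical (this post does not list the ones configured in the video), and the code is a model of how a price falls into a range, not of how Forge or the MDEX implements it.

```python
# Hypothetical price ranges: (label, lower bound inclusive, upper bound exclusive).
PRICE_RANGES = [
    ("0 - 500", 0.0, 500.0),
    ("500 - 1000", 500.0, 1000.0),
    ("1000 - 2000", 1000.0, 2000.0),
]

# Prices of the five sample records, keyed by record id.
PRICES = {"1": 1233.23, "2": 1999.32, "3": 340.67, "4": 500.18, "5": 790.90}


def price_range_label(price):
    """Return the label of the range containing price, or None if no range matches."""
    for label, low, high in PRICE_RANGES:
        if low <= price < high:
            return label
    return None


# Group record ids by the range their price falls into.
buckets = {}
for record_id, price in PRICES.items():
    buckets.setdefault(price_range_label(price), []).append(record_id)

for label, ids in buckets.items():
    print(label, "->", ids)
```

A price outside every configured range gets no label here, which mirrors the practical point that the ranges you define should cover the prices you expect in the data.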
Oracle Endeca EAC Application Initialization.
To initialize the simplecatalog application, open a command prompt in admin mode, move to the ENDECA_HOME/Apps/simplecatalog/control folder and run initialize_services.bat. (On my machine initialize_services.bat was in C:\Endeca\Apps\simplecatalog\control.) The EAC application initialization log can be found in the appendix section.
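initialize_services.bat talks to the EAC, so it cannot succeed if nothing is listening on the EAC port. A quick way to check this beforehand is a plain TCP connection test, sketched below in Python. It assumes the EAC runs on localhost at port 8888, the value entered during deployment in this post; `is_port_open` is a hypothetical helper, and an open port only shows that some service is listening there, not that the EAC is healthy.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Assumption: EAC runs locally on port 8888, as chosen during deployment.
    if is_port_open("localhost", 8888):
        print("EAC port is reachable; initialize_services.bat can be run.")
    else:
        print("Nothing is listening on port 8888; start the Endeca services first.")
```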
Indexing Data into MDEX.
Moving data into the MDEX is a two-step process for a baseline index.
- Run load_baseline_test_data.bat (C:\Endeca\Apps\simplecatalog\control\load_baseline_test_data.bat)
- Run baseline_update.bat (C:\Endeca\Apps\simplecatalog\control\baseline_update.bat)
The console logs for both batch programs can be found in the appendix section.
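If you run these two scripts often, the sequence can be wrapped in a small Python helper that stops when a step fails. This is a sketch under two assumptions: the control directory is the one used in this post, and the batch scripts signal failure through a non-zero exit code, which I have not verified. `run_baseline_index` is a hypothetical name, and the `run` parameter exists only so the sequence can be exercised without Endeca installed.

```python
import subprocess
from pathlib import PureWindowsPath

# Assumption: the control directory used in this post.
CONTROL_DIR = PureWindowsPath(r"C:\Endeca\Apps\simplecatalog\control")
BASELINE_STEPS = ["load_baseline_test_data.bat", "baseline_update.bat"]


def run_baseline_index(control_dir=CONTROL_DIR, run=subprocess.run):
    """Run the two baseline scripts in order and stop at the first failure.

    Returns the list of script names that completed with exit code 0.
    """
    completed = []
    for script in BASELINE_STEPS:
        # 'cmd /c' executes the batch file and then exits.
        result = run(["cmd", "/c", str(control_dir / script)], cwd=str(control_dir))
        if result.returncode != 0:
            break
        completed.append(script)
    return completed
```

On the Windows machine where Endeca is installed, calling `run_baseline_index()` from an admin prompt would run both steps; if the returned list has fewer than two entries, the console output of the last script that ran is the place to look.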
Testing Pipeline and Indexed Data Using jsp Reference Application.
You can test the newly created application using the JSP reference application. Check the video given below to see the details.
Appendix.
Oracle Endeca EAC Application Creation Log.
C:\Endeca\ToolsAndFrameworks\3.1.2\deployment_template\bin>deploy.bat
------------------------------------------------------------------------------
Found version 6.1 of the Endeca IAP installed in directory C:\Endeca\PlatformServices\6.1.3.
If either the version or location are incorrect, type 'Q' to quit and adjust your ENDECA_ROOT environment variable.
Press enter to continue with these settings.
Continue?
------------------------------------------------------------------------------
Deployment Template installation script.
This script creates the directory structure for your deployment and installs configuration files and scripts into the directory structure.
12/29/2013 10:54:04 [deploy.pl] INFO: Starting deployment template installation.
12/29/2013 10:54:04 [AppDescriptorReader] INFO: Parsing application descriptor file C:\Endeca\ToolsAndFrameworks\3.1.2\deployment_template\app-templates\base_descriptor.xml.
------------------------------------------------------------------------------
Enter a short name for your application.
Note: The name must conform to this regular expression: ^[a-zA-Z0-9]+$
Application name: simplecatalog
------------------------------------------------------------------------------
Specify the path into which the application will be deployed.
The specified directory must exist and cannot contain spaces.
For example, to deploy into c:\apps\simplecatalog, specify the path as c:\apps.
Deployment directory: C:\Endeca\Apps
------------------------------------------------------------------------------
Specify the port on which the Endeca Application Controller is running.
This is configured in the server.xml file in the workspace of the Endeca software install and should be the same for all applications deployed in this environment.
Ports must be in the range 1024-65535 [default: 8888].
EAC port: 8888
12/29/2013 10:54:32 [deploy.pl] INFO: Deploying application into C:\Endeca\Apps\simplecatalog
------------------------------------------------------------------------------
What port is the Workbench running? [Default: 8006]
8006
------------------------------------------------------------------------------
What port should be used for the Live Dgraph? [Default: 15000]
17000
------------------------------------------------------------------------------
What port should be used for the Authoring Dgraph? [Default: 15002]
17002
------------------------------------------------------------------------------
What port should be used for LogServer? [Default: 15010]
17010
12/29/2013 10:54:54 [AppDescriptorReader] INFO: Parsing application descriptor file C:\Endeca\ToolsAndFrameworks\3.1.2\deployment_template\app-templates\base_descriptor.xml.
12/29/2013 10:54:54 [deploy.pl] INFO: Processing install with id 'Dgraph'
12/29/2013 10:54:55 [deploy.pl] INFO: Application successfully deployed.
C:\Endeca\ToolsAndFrameworks\3.1.2\deployment_template\bin>
Oracle Endeca EAC Application Initialization Log.
C:\Endeca\Apps\simplecatalog\control>initialize_services.bat
Setting EAC provisioning and performing initial setup...
[12.29.13 17:34:09] INFO: Checking definition from AppConfig.xml against existing EAC provisioning.
[12.29.13 17:34:09] INFO: Setting definition for application 'simplecatalog'.
[12.29.13 17:34:10] INFO: Setting definition for host 'AuthoringMDEXHost'.
[12.29.13 17:34:11] INFO: Setting definition for host 'LiveMDEXHostA'.
[12.29.13 17:34:12] INFO: Setting definition for host 'ReportGenerationHost'.
[12.29.13 17:34:13] INFO: Setting definition for host 'WorkbenchHost'.
[12.29.13 17:34:13] INFO: Setting definition for host 'ITLHost'.
[12.29.13 17:34:14] INFO: Setting definition for component 'AuthoringDgraph'.
[12.29.13 17:34:16] INFO: [AuthoringMDEXHost] Starting shell utility 'mkpath_-data-dgidx-output'.
[12.29.13 17:34:19] INFO: [AuthoringMDEXHost] Starting shell utility 'mkpath_-data-partials-forge-output'.
[12.29.13 17:34:21] INFO: [AuthoringMDEXHost] Starting shell utility 'mkpath_-data-partials-cumulative-partials'.
[12.29.13 17:34:22] INFO: [AuthoringMDEXHost] Starting shell utility 'mkpath_-data-workbench-dgraph-config'.
[12.29.13 17:34:24] INFO: [AuthoringMDEXHost] Starting shell utility 'mkpath_-data-dgraphs-local-dgraph-input'.
[12.29.13 17:34:25] INFO: [AuthoringMDEXHost] Starting shell utility 'mkpath_-data-dgraphs-local-cumulative-partials'.
[12.29.13 17:34:26] INFO: [AuthoringMDEXHost] Starting shell utility 'mkpath_-data-dgraphs-local-dgraph-config'.
[12.29.13 17:34:27] INFO: Setting definition for component 'DgraphA1'.
[12.29.13 17:34:28] INFO: Setting definition for script 'PromoteAuthoringToLive'.
[12.29.13 17:34:28] INFO: Setting definition for custom component 'WorkbenchManager'.
[12.29.13 17:34:28] INFO: Updating provisioning for host 'ITLHost'.
[12.29.13 17:34:28] INFO: Updating definition for host 'ITLHost'.
[12.29.13 17:34:29] INFO: [ITLHost] Starting shell utility 'mkpath_-'.
[12.29.13 17:34:30] INFO: [ITLHost] Starting shell utility 'mkpath_-data-workbench-temp'.
[12.29.13 17:34:31] INFO: Setting definition for custom component 'IFCR'.
[12.29.13 17:34:31] INFO: Updating provisioning for host 'ITLHost'.
[12.29.13 17:34:31] INFO: Updating definition for host 'ITLHost'.
[12.29.13 17:34:32] INFO: [ITLHost] Starting shell utility 'mkpath_-'.
[12.29.13 17:34:33] INFO: [ITLHost] Starting shell utility 'mkpath_-'.
[12.29.13 17:34:34] INFO: Setting definition for component 'LogServer'.
[12.29.13 17:34:35] INFO: [ReportGenerationHost] Starting shell utility 'mkpath_-reports-input'.
[12.29.13 17:34:36] INFO: Setting definition for script 'DaySoFarReports'.
[12.29.13 17:34:36] INFO: Setting definition for script 'DailyReports'.
[12.29.13 17:34:36] INFO: Setting definition for script 'WeeklyReports'.
[12.29.13 17:34:36] INFO: Setting definition for script 'DaySoFarHtmlReports'.
[12.29.13 17:34:37] INFO: Setting definition for script 'DailyHtmlReports'.
[12.29.13 17:34:37] INFO: Setting definition for script 'WeeklyHtmlReports'.
[12.29.13 17:34:37] INFO: Setting definition for component 'WeeklyReportGenerator'.
[12.29.13 17:34:37] INFO: Setting definition for component 'DailyReportGenerator'.
[12.29.13 17:34:38] INFO: Setting definition for component 'DaySoFarReportGenerator'.
[12.29.13 17:34:38] INFO: Setting definition for component 'WeeklyHtmlReportGenerator'.
[12.29.13 17:34:38] INFO: Setting definition for component 'DailyHtmlReportGenerator'.
[12.29.13 17:34:38] INFO: Setting definition for component 'DaySoFarHtmlReportGenerator'.
[12.29.13 17:34:39] INFO: Setting definition for script 'BaselineUpdate'.
[12.29.13 17:34:39] INFO: Setting definition for script 'PartialUpdate'.
[12.29.13 17:34:39] INFO: Setting definition for component 'Forge'.
[12.29.13 17:34:40] INFO: [ITLHost] Starting shell utility 'mkpath_-data-incoming'.
[12.29.13 17:34:41] INFO: Setting definition for component 'PartialForge'.
[12.29.13 17:34:42] INFO: [ITLHost] Starting shell utility 'mkpath_-data-partials-incoming'.
[12.29.13 17:34:43] INFO: Setting definition for component 'Dgidx'.
[12.29.13 17:34:43] INFO: Definition updated.
[12.29.13 17:34:43] INFO: Provisioning site from prototype...
[12.29.13 17:34:45] INFO: Finished provisioning site from prototype.
[12.29.13 17:34:45] INFO: Uploading config files to Workbench.
[12.29.13 17:34:45] INFO: [ITLHost] Starting shell utility 'emgr_update_update_mgr_settings'.
[12.29.13 17:35:07] INFO: Finished uploading config files to Workbench.
Finished updating EAC.
Importing editors configuration...
[12.29.13 17:35:10] INFO: Checking definition from AppConfig.xml against existing EAC provisioning.
[12.29.13 17:35:11] INFO: Definition has not changed.
[12.29.13 17:35:11] INFO: Packaging contents for upload...
[12.29.13 17:35:11] INFO: Finished packaging contents.
[12.29.13 17:35:11] INFO: Uploading contents to: http://thosan-VAIO:8006/ifcr/sites/simplecatalog/configuration/tools/xmgr
[12.29.13 17:35:13] INFO: Finished uploading contents.
Finished importing editors configuration
Importing media...
Finished importing media
Importing templates...
Removing existing cartridge templates for simplecatalog
Setting new cartridge templates for simplecatalog
There are no templates in the specified directory.
C:\Endeca\Apps\simplecatalog\control>
Load Baseline Test Data Log.
C:\Endeca\Apps\simplecatalog\control>load_baseline_test_data.bat
C:\Endeca\Apps\simplecatalog\config\script\..\..\test_data\baseline\data.txt
1 file(s) copied.
Setting flag 'baseline_data_ready' in the EAC.
C:\Endeca\Apps\simplecatalog\control>
Update Baseline Test Data Log.
C:\Endeca\Apps\simplecatalog\control>baseline_update.bat
[12.29.13 19:03:58] INFO: Checking definition from AppConfig.xml against existing EAC provisioning.
[12.29.13 19:03:59] INFO: Definition has not changed.
[12.29.13 19:03:59] INFO: Starting baseline update script.
[12.29.13 19:03:59] INFO: Acquired lock 'update_lock'.
[12.29.13 19:03:59] INFO: [ITLHost] Starting shell utility 'move_-_to_processing'.
[12.29.13 19:04:01] INFO: [ITLHost] Starting copy utility 'fetch_config_to_input_for_forge_Forge'.
[12.29.13 19:04:02] INFO: [ITLHost] Starting backup utility 'backup_log_dir_for_component_Forge'.
[12.29.13 19:04:03] INFO: [ITLHost] Starting component 'Forge'.
[12.29.13 19:04:17] INFO: [ITLHost] Starting backup utility 'backup_log_dir_for_component_Dgidx'.
[12.29.13 19:04:18] INFO: [ITLHost] Starting component 'Dgidx'.
[12.29.13 19:04:28] INFO: [AuthoringMDEXHost] Starting copy utility 'copy_index_to_host_AuthoringMDEXHost_AuthoringDgraph'.
[12.29.13 19:04:30] INFO: Applying index to dgraphs in restart group 'A'.
[12.29.13 19:04:30] INFO: [AuthoringMDEXHost] Starting shell utility 'mkpath_dgraph-input-new'.
[12.29.13 19:04:31] INFO: [AuthoringMDEXHost] Starting copy utility 'copy_index_to_temp_new_dgraph_input_dir_for_AuthoringDgraph'.
[12.29.13 19:04:33] INFO: [AuthoringMDEXHost] Starting shell utility 'move_dgraph-input_to_dgraph-input-old'.
[12.29.13 19:04:34] INFO: [AuthoringMDEXHost] Starting shell utility 'move_dgraph-input-new_to_dgraph-input'.
[12.29.13 19:04:36] INFO: [AuthoringMDEXHost] Starting backup utility 'backup_log_dir_for_component_AuthoringDgraph'.
[12.29.13 19:04:37] INFO: [AuthoringMDEXHost] Starting component 'AuthoringDgraph'.
[12.29.13 19:04:43] INFO: Publishing Workbench 'authoring' configuration to MDEX 'AuthoringDgraph'
[12.29.13 19:04:43] INFO: Pushing authoring content to dgraph: AuthoringDgraph
[12.29.13 19:04:44] INFO: Finished pushing content to dgraph.
[12.29.13 19:04:44] INFO: [AuthoringMDEXHost] Starting shell utility 'rmdir_dgraph-input-old'.
[12.29.13 19:04:45] INFO: [LiveMDEXHostA] Starting shell utility 'cleanDir_local-dgraph-input'.
[12.29.13 19:04:47] INFO: [LiveMDEXHostA] Starting copy utility 'copy_index_to_host_LiveMDEXHostA_DgraphA1'.
[12.29.13 19:04:49] INFO: Applying index to dgraphs in restart group '1'.
[12.29.13 19:04:49] INFO: [LiveMDEXHostA] Starting shell utility 'mkpath_dgraph-input-new'.
[12.29.13 19:04:50] INFO: [LiveMDEXHostA] Starting copy utility 'copy_index_to_temp_new_dgraph_input_dir_for_DgraphA1'.
[12.29.13 19:04:52] INFO: [LiveMDEXHostA] Starting shell utility 'move_dgraph-input_to_dgraph-input-old'.
[12.29.13 19:04:53] INFO: [LiveMDEXHostA] Starting shell utility 'move_dgraph-input-new_to_dgraph-input'.
[12.29.13 19:04:55] INFO: [LiveMDEXHostA] Starting backup utility 'backup_log_dir_for_component_DgraphA1'.
[12.29.13 19:04:56] INFO: [LiveMDEXHostA] Starting component 'DgraphA1'.
[12.29.13 19:05:02] INFO: Publishing Workbench 'live' configuration to MDEX 'DgraphA1'
[12.29.13 19:05:02] INFO: Pushing live content to dgraph: DgraphA1
[12.29.13 19:05:02] INFO: Finished pushing content to dgraph.
[12.29.13 19:05:03] INFO: [LiveMDEXHostA] Starting shell utility 'rmdir_dgraph-input-old'.
[12.29.13 19:05:13] INFO: [ITLHost] Starting copy utility 'fetch_post_forge_dimensions_to_ws_temp_dir_C-Endeca-Apps-simplecatalog-config-script-data-workbench-temp'.
[12.29.13 19:05:14] INFO: Uploading post-Forge Dimensions to Workbench.
[12.29.13 19:05:16] INFO: [ITLHost] Starting shell utility 'emgr_update_set_post_forge_dims'.
[12.29.13 19:05:23] INFO: [ITLHost] Starting backup utility 'backup_state_dir_for_component_Forge'.
[12.29.13 19:05:23] INFO: [ITLHost] Starting backup utility 'backup_index_Dgidx'.
[12.29.13 19:05:28] INFO: [ReportGenerationHost] Starting backup utility 'backup_log_dir_for_component_LogServer'.
[12.29.13 19:05:28] INFO: [ReportGenerationHost] Starting component 'LogServer'.
[12.29.13 19:05:30] INFO: Released lock 'update_lock'.
[12.29.13 19:05:30] INFO: Baseline update script finished.
C:\Endeca\Apps\simplecatalog\control>
I hope this post helped you understand the process involved in creating an Oracle Endeca pipeline. I will write another post soon about a JDBC pipeline.
Hi Sanju, I wonder if you could answer a question for me. My company is considering Endeca as a search engine, but we have some questions that cannot easily be answered from the Oracle docs alone (I should mention I’m an Endeca newbie, so relying only on the Oracle docs is still a bit hard for me). Let me give you a simple example:
But as I mentioned, I’m an Endeca newbie and I wonder how it meets efficiency requirements and also how the data is stored. I would be very grateful if you could answer my question.
Let’s assume we have an application that lets users search for houses/flats to rent.
I want to rent a house and I have 3 “objects” to be a data source to Endeca query:
1. Country
– id (Simple Property)
– name (Simple Property)
– continent: List (Dimension)
– temperature range: List (Dimension)
2. Equipment
– id (Simple Property)
– name (Simple Property)
– pool: Boolean (Dimension)
– number of bedrooms: List (Dimension)
– size of garden: List (Dimension)
3. Time
– start date: Date
– end date: Date
As a customer I want to search for a particular house and fill in all of the attributes. And here is my actual question: how is data stored in Endeca? Does it keep every “permutation” of possible answers as a plain data source? Or does it build the query ad hoc and work like a “join” in SQL? I don’t know whether my question is stupid.
Yes, the data is de-normalized and stored in key-value form. That is the very high-level picture; the MDEX is a black box and I don’t know its internal representation of data. If you need a detailed explanation, please take a look at page 20 of the Basic Development Guide (http://docs.oracle.com/cd/E28910_01/MDEX.622/pdf/BasicDevGuide.pdf). In general, if query speed is your priority, you would keep a separate record for every combination. Reducing the number of records brings down the indexing time; in that case you may maintain the lists as multi-value attributes instead of separate records.
I started following the same steps, but when I run initialize_services.bat I get the error below:
C:\Endeca\Apps\simplecatalog\control>initialize_services.bat
[07.12.14 13:07:57] SEVERE: Caught exception while querying for defined application list.
Occurred while executing line 18 of valid BeanShell script:
[[
15|
16| // If the application is already defined
17| // log an error and exit with code 1
18| if (app.isDefined()) {
19| log.severe("An application already exists with the name, \"" + provObj.getAppName() + "\". " +
20| "Please use the '--force' option if you want to replace all existing configuration.");
21| System.exit(1);
]]
[07.12.14 13:07:57] SEVERE: Caught an exception while invoking method ‘run’ on object ‘AssertNotDefined’. Releasing locks.
Caused by java.lang.reflect.InvocationTargetException
sun.reflect.NativeMethodAccessorImpl invoke0 – null
Caused by com.endeca.soleng.eac.toolkit.exception.AppControlException
com.endeca.soleng.eac.toolkit.script.Script runBeanShellScript – Error executing valid BeanShell script.
Caused by com.endeca.soleng.eac.toolkit.exception.EacCommunicationException
com.endeca.soleng.eac.toolkit.application.Application isDefined – Caught exception while querying for defined application list.
Caused by org.apache.axis.AxisFault
org.apache.axis.AxisFault makeFault – ; nested exception is:
java.net.ConnectException: Connection refused: connect
Caused by java.net.ConnectException
java.net.PlainSocketImpl socketConnect – Connection refused: connect
Can you please help me with this?
Hi, nice explanation for extending the commerce pipeline. Can you tell me how to create a new pipeline chain and how to invoke it through a form handler?
Can you publish the video for creating the pipeline?
Hi,
Thanks for the detailed steps.
I’m getting this warning when I run the baseline update for my sample application.
WARN 04/01/15 12:17:54.441 UTC (1427890674441) FORGE {baseline}: Property ‘color
‘ is not a valid NCName.
WARN 04/01/15 12:17:54.581 UTC (1427890674579) FORGE {baseline}: Forge completed with 0 errors and 1 warning.
Could you please let me know what the issue is here? I followed the same steps as mentioned above. The color dimension is not being picked up and it is not displayed in the JSP page. I look forward to your help.
Regards,
Laxmana