BI / ETL / DWH: 11g


Tuesday, 24 November 2015

How-to: Mapviewer Integration with OBIEE 11g (11.1.1.6 and higher)

I've seen quite a few articles on OTN and Google which outline how to configure Mapviewer for your 11g solution. The problem is that many of these articles are:

1) overly complex
2) out of date

With the release of OBIEE 11.1.1.6, mapviewer comes pre-configured with weblogic and the only installation steps required are:

installing the navteq map data into your Oracle database
establishing a column-based relationship between the map and a subject area

You do not have to modify any weblogic XML files, install any mapviewer jar files, or do any configuration within weblogic. Below is a step by step guide on how to configure and use mapviewer on your 11.1.1.6.x box:

Step 1: Download the mapviewer 'mvdemo' data into your Oracle database

Oracle provides pre-configured maps, coordinates, and navteq data that you can use in your reports. You need to download the MVDemo Sample Data Set.

Step 2: Create the mvdemo schema in your Oracle database

We're going to import the navteq map data into an mvdemo schema in step 3. Let's go ahead and create the mvdemo schema by typing the following command into your SQLPlus prompt:

grant connect, resource, create view to mvdemo identified by mvdemo;

Note that you will need the appropriate privileges to create this account. If denied, try logging in as sysdba by typing the following command into sqlplus:

CONNECT / AS sysdba

Step 3: Import navteq data dump to your Oracle database

Unzip the MV Demo Sample Data Set you just downloaded, and note the location of the 'mvdemo.dmp' file. This is the file we will import to the database.

Step 3.1)
Find the imp utility on your machine. It is usually located in the BIN folder of your Oracle home (product\11.x.x\dbhome_1\BIN).

Step 3.2) Navigate to that folder via command line and run the following command:

imp mvdemo/mvdemo@ORCL file=mvdemo.dmp full=y ignore=y
where ORCL is your database SID
and file=mvdemo.dmp is the path (including mvdemo.dmp) to the dump file

You should get the following result:

Step 4) Import the map meta data

The Map Viewer Sample Data Set comes with city, county, state, highway, and other geographical map based data that you can impose onto the map. We're going to import this data set by running the following command in sqlplus:
@mvdemo.sql

Note that you must include the path of the mvdemo.sql file e.g. @C:\folder\folder1\mvdemo\mvdemo.sql

Step 5) Add your Oracle Database as a MapViewer Data Source

No, we're not doing this in weblogic - Mapviewer data source configuration is still done in your http://localhost:9704/mapviewer location

You'll arrive at a landing page like below, where you must click the 'Admin' button to log in:

5.1) Log in using your weblogic username/password

5.2) You should arrive at a home page with a link to view Datasources. Click it and you'll arrive at the data source form:

Name = name/description of your data source

Host = hostname/ip address of your database

Port = database port number

SID = system identifier (SID) of your oracle database (by default it is orcl)

user/password: If you followed my above steps, it will be mvdemo/mvdemo

# Mappers and Max Connections specify how many simultaneous users can connect to the MapViewer db. For diagnostic purposes I would make this relatively high, and once development is complete you can adjust as needed.

Step 6) Modify the mapViewerConfig.xml file to include the new data source

I'm a little surprised that this must be done manually. If anyone has any insight, please feel free to leave feedback. After you add the data source as outlined in step 5, you must then modify the mapViewerConfig.xml file to include said datasource, otherwise when the BI Server is rebooted, your datasource connection will be removed!

Luckily, this step is not too difficult

6.1) In :7001/mapviewer, log into your Admin screen and navigate to Management -> Configuration

Then add the following XML to the bottom of the config file, right above the </MapperConfig> line.

<map_data_source name="mvdemo"
    jdbc_host="db1.my_corp.com"
    jdbc_sid="orcl"
    jdbc_port="1521"
    jdbc_user="scott"
    jdbc_password="!tiger"
    jdbc_mode="thin"
    number_of_mappers="3"
    allow_jdbc_theme_based_foi="false"
/>

Modify each line using the inputs you provided in step 5. Note that the jdbc_password value should have a ! in front of it, as that is Oracle's indicator to encrypt the password upon restart.
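If you'd rather generate the element than hand-edit it, here is a minimal Python sketch. It only builds the XML string from the step 5 inputs; the function name build_map_data_source and the sample values are my own illustration, not part of MapViewer, and you still paste the result into the config screen yourself.

```python
import xml.etree.ElementTree as ET

def build_map_data_source(name, host, sid, port, user, password, mappers=3):
    """Return a <map_data_source> element as an XML string (illustrative helper)."""
    attrs = {
        "name": name,
        "jdbc_host": host,
        "jdbc_sid": sid,
        "jdbc_port": str(port),
        "jdbc_user": user,
        # Prefix the clear-text password with "!" so it is encrypted on restart.
        "jdbc_password": "!" + password,
        "jdbc_mode": "thin",
        "number_of_mappers": str(mappers),
        "allow_jdbc_theme_based_foi": "false",
    }
    element = ET.Element("map_data_source", attrs)
    return ET.tostring(element, encoding="unicode")

# Sample values assume the mvdemo/mvdemo account from step 2 on a local database.
snippet = build_map_data_source("mvdemo", "localhost", "orcl", 1521, "mvdemo", "mvdemo")
print(snippet)
```

Building the element with an XML library rather than string concatenation means the output is always well formed, including the closing of the tag.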

Step 7) Import a map layer into Answers

We've completed all of the back end work required to create a map. Now we'll go into Answers -> Administration -> Manage Map Data and import a layer (theme) that we'll use for our map.

A theme is a visual representation of the data, and arguably the most important component in creating a map. In this example let's use the "THEME_DEMO_COUNTIES" layer, which will give us the ability to impose a dataset over various Counties in the USA.

Step 8) Specify the BI Column used to associate the THEME_DEMO_COUNTIES layer to a dataset

The theme we're using, "THEME_DEMO_COUNTIES" stores attributes of Counties (county name, county lines, etc) which we can visualize on a map. We need to identify a way to 'join' the data set in OBIEE to the dataset of THEME_DEMO_COUNTIES.

After saving the layer you just created, click the 'Edit' button (pencil) to bring up the screen below.

Notice there is 'Layer Key' with the following column values; County, Fips, Population, State. We are going to use 'County' as the map column to join to the subject area.

Next we need to specify a column from our subject area which contains 'County'.

Step 9) Specify the Background Map

In steps 7 and 8 we specified a theme (visual representation of the data) and identified how to join the map data to our subject area (via County column). Now we need to specify which background map the theme will use.

In the 'Background' tab, create a new background map and specify 'DEMO_MAP' as the background map.

After saving, edit the map to ensure the THEME_DEMO_COUNTIES layer has successfully been applied:

It will default to the middle of the USA but I decided to zoom into California :)

Step 10) Create a Report using the County Column

Now we're ready to create the report! Create a new analysis, select the County column you specified in step 8, and a fact column which joins to the county dimension. Click the results tab, then New View -> Maps.

The result below outlines only California because the dataset I created uses only California Counties.

Note that I did not go into the MapBuilder tool, which you can use if you want to create custom themes and maps (e.g. map of a building, school, casino, etc). But this works great for a proof of concept!


How-to: Bridge Tables and Many to Many Relationships Demystified in OBIEE 11g

Bridge tables - entire books have been devoted to this concept, countless blogs write about it, and organizations offer entire classes dedicated to demystifying this idea. Ralph Kimball, creator of Kimball Dimensional Modeling and founder of the Kimball Group, has written quite a few great articles discussing the theory of bridge tables.

Yet when researching for comprehensive guides on how to actually implement a bridge table in OBIEE 11g, the documentation available is either:

Out of date (contains implementation steps for OBIEE 10g, which has since been deprecated)
Lacking adequate detail (e.g. missing key steps)

This guide is going to outline the basic use case of a many to many relationship, how OBIEE 11g resolves this dilemma and how to successfully implement a bridge table model within the 11g platform.

First things first - what is a bridge table and why do we need it?

At its core, a bridge table solves the many to many relationship we encounter in many datasets. Many to many relationships in themselves are not "bad", but when attempting to conform a data set to a star schema, many to many relationships just do not work. Star schemas assume a one to many (1:N) cardinality from the dimension to the fact. This means "one attribute of a dimension row can be found in many rows of the fact table".

For Example:

One job (job dimension) can be performed by many people

You would see the same JOB_WID repeating in the fact table

One employee (employee dimension) can have many jobs

You would see the same EMPLOYEE_WID repeating in the fact table

One call at a call center (ticket dimension) can have many ticket types

You would see the same CALL_WID repeating in the fact table

One patient (patient dimension) can have many diagnoses

You would see the same PATIENT_WID repeating in the fact table

This 1:N cardinality is represented in OBIEE as (using call center/employee example):

"Cardinality of '1' applied to the dimension and cardinality of 'N' applied to the fact"

But what happens when in the above examples, the cardinality is actually N:N?

For Example:

Many employees can have multiple jobs and each job can be performed by multiple employees
Many patients can have multiple diagnoses and each diagnosis can be 'assigned' to many patients
Many calls can have multiple call ticket types and each ticket type can belong to multiple calls
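
A quick way to check which situation you are in is to count partners on both sides of the relationship. The sketch below is only an illustration in Python: the cardinality function and the sample employee/job pairs are hypothetical and not taken from any OBIEE dataset.

```python
from collections import defaultdict

def cardinality(pairs):
    """Classify the relationship between left and right keys in (left, right) pairs."""
    rights_per_left = defaultdict(set)
    lefts_per_right = defaultdict(set)
    for left, right in pairs:
        rights_per_left[left].add(right)
        lefts_per_right[right].add(left)
    # True when at least one left key maps to several right keys, and vice versa.
    left_many = any(len(v) > 1 for v in rights_per_left.values())
    right_many = any(len(v) > 1 for v in lefts_per_right.values())
    if left_many and right_many:
        return "N:N"
    if left_many:
        return "1:N"
    if right_many:
        return "N:1"
    return "1:1"

# Hypothetical (employee, job) assignments: emp1 holds two jobs, jobA has two employees.
employee_jobs = [("emp1", "jobA"), ("emp1", "jobB"), ("emp2", "jobA")]
print(cardinality(employee_jobs))  # N:N
```

If the result is N:N for a dimension you planned to join straight to the fact, that is the signal a bridge table is needed.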

This many to many relationship is initially (and incorrectly) represented in OBIEE 11g as:

"Cardinality of '1' is applied to the two dimension tables and cardinality of 'N' is applied to the fact"

Any BI Architect should recognize the above model - it's a traditional star schema! If you stop here and decide to ignore the issue with your dataset and 'hope' OBIEE aggregates the model correctly, you're about to be disappointed.

Why star schemas don't work for N:N cardinality

Consider the following scenario: You're a call center manager and you want to capture the number of calls with positive feedback per employee. You also want to capture the type of calls an employee answers in any given day.

Upon analysis of the requirements you conclude that "each call received by an employee can have many call types and each call type can be answered by multiple employees".

For example:

I take a call that is classified as a 'new call', 'urgent', and 'out of state transfer' (three different call types) - this is the "each call received by an employee can have many call types".
A colleague also receives a phone call that is classified as 'out of state transfer' - this is the "each call type can be answered by multiple employees".

Now let's put this data in a traditional star schema fact table as modeled below:

ID	EMPLOYEE_WID	CALL_TYPE_WID	NUMBER_OF_GOOD_CALLS
1	1	1	300
2	1	2	300
3	1	3	300
4	2	2	500
5	2	3	500
6	3	1	200

You can see in the above data set that:

EMPLOYEE 1:

Has 3 different call types
Has 300 positive reviews (NUMBER_OF_GOOD_CALLS)

This metric is at the EMPLOYEE level and not the call type level!

EMPLOYEE 2:

Has 2 different call types
Has 500 positive reviews (NUMBER_OF_GOOD_CALLS)

This metric is at the EMPLOYEE level and not the call type level

EMPLOYEE 3:

Has 1 different call type

Has 200 positive reviews (NUMBER_OF_GOOD_CALLS)

Now you receive a requirement to create a KPI that displays the Number of Good Calls as a stand alone widget.

PROBLEM 1 - Aggregation :

The number of good calls you received based on the above fact table is not 2100 - it's 300 + 500 + 200 = 1000

Employee 1 received 300 good calls
Employee 2 received 500 good calls
Employee 3 received 200 good calls

but due to the many to many cardinality of the data, the star schema duplicates the measures because each employee can take calls of many call types and each call type can be assigned to many employees!
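
To make the double counting concrete, here is a small Python sketch that reproduces both numbers from the fact table above. The rows are copied from that table; the variable names are just for illustration.

```python
# Rows copied from the star schema fact table above:
# (ID, EMPLOYEE_WID, CALL_TYPE_WID, NUMBER_OF_GOOD_CALLS)
fact_rows = [
    (1, 1, 1, 300),
    (2, 1, 2, 300),
    (3, 1, 3, 300),
    (4, 2, 2, 500),
    (5, 2, 3, 500),
    (6, 3, 1, 200),
]

# What a plain SUM over the fact table returns: every repeated row is counted.
naive_total = sum(row[3] for row in fact_rows)

# The measure lives at the employee level, so keep one value per employee.
good_calls_by_employee = {}
for _id, employee_wid, _call_type_wid, good_calls in fact_rows:
    good_calls_by_employee[employee_wid] = good_calls
correct_total = sum(good_calls_by_employee.values())

print(naive_total)    # 2100
print(correct_total)  # 1000
```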

PROBLEM 2 - Grand Totaling:

What if you don't care about aggregates? What if you just want a report that contains the employee, call type and a summation/grand total?

Notice how NUMBER_OF_GOOD_CALLS is repeated across multiple call types and the grand total is still incorrect. It's being duplicated due to the many to many relationship that exists between call type and employee. Furthermore, it paints an incorrect picture that NUMBER_OF_GOOD_CALLS is somehow related to CALL_TYPE.

How do we resolve this many to many cardinality with a bridge table?

When all is said and done, the incorrectly built star schema:

should be modified to:

Let's break this down:

The bridge table:

The purpose of the bridge table is to resolve the many to many relationship between the call type and employee. It will contain, at a minimum, the following four columns:

The primary key of the table
The EMPLOYEE_WID
The CALL_TYPE_WID
The weight factor

The weight factor is what's going to resolve the issue of double counting.

If an employee has 3 call types, there would be 3 rows and the weight factor of each row would be .33
If an employee has 10 call types, there would be 10 rows and the weight factor of each row would be .1

In our bridge table data set, we're going to use the same 3 EMPLOYEE_WIDs and create the following:

ID	CALL_TYPE_WID	EMPLOYEE_WID	WEIGHT
11	1	1	0.33
12	2	1	0.33
13	3	1	0.33
23	2	2	0.5
24	3	2	0.5
31	1	3	1

You can see from this example that we've taken the N:N dataset in the fact table and moved it into this bridge.
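
The weights in this bridge table follow directly from how many call types each employee has. As a rough Python sketch (the pairs come from the table above, and rounding to two decimals is my assumption about how the .33 was produced):

```python
from collections import Counter

# (EMPLOYEE_WID, CALL_TYPE_WID) pairs taken from the bridge table above
pairs = [(1, 1), (1, 2), (1, 3), (2, 2), (2, 3), (3, 1)]

# Number of call types (bridge rows) per employee
call_types_per_employee = Counter(emp for emp, _ in pairs)

# Weight = 1 / number of call types for that employee, rounded to 2 decimals
bridge_rows = [
    (call_type, emp, round(1 / call_types_per_employee[emp], 2))
    for emp, call_type in pairs
]

for row in bridge_rows:
    print(row)  # (CALL_TYPE_WID, EMPLOYEE_WID, WEIGHT)
```

In a real ETL load you would compute the weight in the same pass that populates the bridge, so the weights for one employee always add up to (roughly) 1.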

The dimension that is joined to both the fact and bridge

This is a generic dimension that contains the unique EMPLOYEE IDs in your organization's dataset.

For example:

ID	EMPLOYEE_ID
1	1
2	2
3	3
4	4
5	5
6	6
7	7
8	8
9	9
10	10

The dimension that is joined to only the bridge table

This dimension contains all of the possible call types. Note how this table is not physically joined to the fact. This is because this specific dimension (CALL_TYPE) is what's causing the N:N cardinality

For example:

ID	DESC
1	Call Type 1
2	Call Type 2
3	Call Type 3
4	Call Type 4
5	Call Type 5
6	Call Type 6
7	Call Type 7
8	Call Type 8
9	Call Type 9
10	Call Type 10

The Fact Table

We've moved the N:N cardinality from the original fact table to the bridge table so the new fact table now contains exactly one row per employee and does not have the CALL_TYPE_WID.

ID	EMPLOYEE_WID	NUMBER_OF_GOOD_CALLS
1	1	300
2	2	500
3	3	200

How do we implement this model in OBIEE 11g?

Step 1: Import Tables into Physical Layer

This is always the first step performed when creating a model regardless of its type. In the above example I'm importing four tables:

Step 2: Create the Physical Data Model

Based on our data set above the join conditions would be implemented as follows:

1:N relationship from employee dimension to fact table
1:N relationship from employee dimension to bridge
1:N relationship from call type dimension to bridge

Notice how w_employee_demo_d is the only dimension that is joined to the fact. w_calltype_demo_d is not joined to the fact because that is the dimension that is causing the many to many relationship issue.
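
If you want to try these joins outside of OBIEE first, here is a sketch using Python's built-in sqlite3 module. The table and column names are simplified stand-ins for the four tables in this post (not the actual physical table names), and SQLite is only there so the example runs anywhere.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee_d  (id INTEGER PRIMARY KEY, employee_id INTEGER);
CREATE TABLE call_type_d (id INTEGER PRIMARY KEY, descr TEXT);
CREATE TABLE calls_f     (id INTEGER PRIMARY KEY,
                          employee_wid INTEGER REFERENCES employee_d(id),
                          number_of_good_calls INTEGER);
CREATE TABLE bridge      (id INTEGER PRIMARY KEY,
                          call_type_wid INTEGER REFERENCES call_type_d(id),
                          employee_wid  INTEGER REFERENCES employee_d(id),
                          weight REAL);
INSERT INTO employee_d VALUES (1,1),(2,2),(3,3);
INSERT INTO call_type_d VALUES (1,'Call Type 1'),(2,'Call Type 2'),(3,'Call Type 3');
INSERT INTO calls_f VALUES (1,1,300),(2,2,500),(3,3,200);
INSERT INTO bridge VALUES (11,1,1,0.33),(12,2,1,0.33),(13,3,1,0.33),
                          (23,2,2,0.5),(24,3,2,0.5),(31,1,3,1.0);
""")

# Measure only: the fact joins to the employee dimension and nothing else.
total = conn.execute("""
    SELECT SUM(f.number_of_good_calls)
    FROM calls_f f JOIN employee_d e ON e.id = f.employee_wid
""").fetchone()[0]

# Call types are reachable only through employee -> bridge -> call type.
call_types_for_emp1 = [r[0] for r in conn.execute("""
    SELECT c.descr
    FROM employee_d e
    JOIN bridge b      ON b.employee_wid = e.id
    JOIN call_type_d c ON c.id = b.call_type_wid
    WHERE e.id = 1
    ORDER BY c.id
""")]

print(total)                # 1000
print(call_types_for_emp1)  # ['Call Type 1', 'Call Type 2', 'Call Type 3']
```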

Step 3: Create the Logical Data Model
The creation of the BMM is where we deviate from our standard build steps of a traditional star schema:

All associated dimension tables referencing the bridge table will be stored in a single BMM table
The single BMM table will have two logical table sources

Step 3.1 : Drag the fact table and dimension table that is connected to the fact table into the BMM.

In our example, we are dragging w_calls_f and w_employee_demo_d into the BMM:

Step 3.2: Create a 2nd LTS in the existing dimension table

Right click W_EMPLOYEE_DEMO_D -> New Object -> New Logical Table Source
Name it 'Bridge'
Add W_BRIDGE_D and W_CALLTYPE_DEMO_D (the two dimensions not directly joined to the fact table) under the 'Map to these tables' section

Next add the remaining dimension columns from W_CALLTYPE_DEMO_D and W_BRIDGE_DEMO_D to the Dimension table in the BMM

Step 3.3: Create a level-based dimension hierarchy for the dimension BMM

This step should be completed whether or not the schema is a star or bridge

Step 3.4: Confirm the BMM model has a 1:N relationship from the dimension to fact

Step 3.5: Set aggregation rule of NUMBER_OF_GOOD_CALLS to sum

All measures in the BMM must have a mathematical operation applied to the column

Step 3.6: Set the Content level of the dimension table to 'detail' within the LTS of the fact table

Again, this is something that should always take place regardless of the type of model

Step 4: Create the Presentation Layer

This part is straightforward, just drag the folders from the BMM into the new subject area:

The moment of truth

So why did we go through this elaborate exercise again? To fix the aggregation issues we were having with NUMBER_OF_GOOD_CALLS due to the N:N cardinality of the data set. Let's create that 'standalone KPI' Number of Good Calls:

Notice how the metric correctly sums to 1000. Let's check the back end physical query to confirm:

Notice how it's hitting the fact table and not the bridge or the call type dimension.

But what about the weight factor?

Let's go back to the scenario where we want to compare across dimensions joined via the bridge table (EMPLOYEE and CALL_TYPE):

When creating a report that uses a measure from the fact table, a dimension value from the employee table, and a dimension value from the table that causes the N:N cardinality - you need to use the weight factor to make sure your measure isn't getting double or triple counted:

Notice column 1 is using NUMBER_OF_GOOD_CALLS multiplied by the WEIGHT factor in column 2
Each row in column 1 correctly represents the NUMBER_OF_GOOD_CALLS in the fact table despite having the repeated values of multiple call types
Note the grand total sums to 997 rather than 1000. This is because the weight factor for EMPLOYEE_WID = 1 is rounded to the 2nd decimal (.33 instead of 1/3), so its three rows add up to 297 instead of 300
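
Here is the same arithmetic as a short Python sketch, so you can see where the 997 comes from. The data is copied from the bridge and fact tables above; using Decimal and Fraction is just my way of keeping the rounding visible and is not how OBIEE computes it.

```python
from decimal import Decimal
from fractions import Fraction

# (CALL_TYPE_WID, EMPLOYEE_WID) rows from the bridge table
bridge_rows = [(1, 1), (2, 1), (3, 1), (2, 2), (3, 2), (1, 3)]
# EMPLOYEE_WID -> NUMBER_OF_GOOD_CALLS from the new fact table
good_calls = {1: 300, 2: 500, 3: 200}
rows_per_employee = {1: 3, 2: 2, 3: 1}

# Weights as stored in the bridge (rounded) versus the exact 1/n weights
rounded_weight = {1: Decimal("0.33"), 2: Decimal("0.5"), 3: Decimal("1")}
exact_weight = {emp: Fraction(1, n) for emp, n in rows_per_employee.items()}

# Grand total = sum of NUMBER_OF_GOOD_CALLS * WEIGHT over every bridge row
rounded_total = sum(good_calls[emp] * rounded_weight[emp] for _, emp in bridge_rows)
exact_total = sum(good_calls[emp] * exact_weight[emp] for _, emp in bridge_rows)

print(rounded_total)  # 997.00
print(exact_total)    # 1000
```

If the 3 missing calls matter to your users, store the weight with more decimal places in the bridge table.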

In order for grand totaling to work correctly with bridge table measures that use weight factors, you must set the aggregation rule of the column (in this case column 1) to sum within Answers:

So what did we accomplish in this guide?

A basic understanding of many to many (N:N) cardinality
A basic understanding of why the star schema won't work for N:N cardinality
How to resolve the cardinality issue with a bridge table
How to implement a bridge table in OBIEE 11g

Monday, 21 September 2015

Purge Cache Automation in OBIEE 11g

Why do we Purge Cache Automatically?

The cache in OBIEE 11g can greatly improve query performance. But when we refresh the data mapping or reload the transaction data, reports can keep showing out-of-date results because they are still served from the cache. The best approach is to clear the cache after loading data or modifying the RPD. After some research I came up with a way to clear the cache automatically: use the scheduling tool on the ETL server to invoke scripts after the ETL load. The steps are as follows.

Step 1: Clear Cache on Oracle BI

OBIEE 11g has the Oracle BI Server utilities nqcmd and NQClient for running test queries against the repository. We can use the nqcmd command to clear the Oracle BI Server cache. The nqcmd utility is available on both Windows and UNIX systems. The syntax of the nqcmd command is:

nqcmd -dmy_dsn -umy_username [-pmy_password] -ssql_input_file -omy_result_file

1) Create a file called purgecache.txt and place it at [FMW_HOME]/instances/instance1/bifoundation/OracleBIApplication/coreapplication/setup

In the file, enter the code “call SAPurgeAllCache();” (without the quotes), which is a special BI Server command for clearing the entire cache.

2) Create a shell script called purgecache.sh and place it in the directory where you store your custom scripts. nqcmd needs some paths that are set in bi-init.sh, so the script sources bi-init.sh before it runs nqcmd. purgecache.sh contains the following commands.

source [FMW_HOME]/instances/instance1/bifoundation/OracleBIApplication/coreapplication/setup/bi-init.sh
[FMW_HOME]/Oracle_BI1/bifoundation/server/bin/nqcmd -d AnalyticsWeb -u administrator -p password -s [FMW_HOME]/instances/instance1/bifoundation/OracleBIApplication/coreapplication/setup/purgecache.txt
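
If your scheduler prefers calling Python over a shell script, the same nqcmd call can be wrapped as in the sketch below. The flags are the ones from the syntax above; the function names and the /path/to placeholders are hypothetical, and the sketch assumes the bi-init.sh environment is already set in the calling shell. I have not verified that nqcmd returns a non-zero exit code for every failure, so treat check=True as a first line of defence only.

```python
import subprocess

def build_nqcmd_command(nqcmd_path, dsn, user, password, sql_file):
    """Assemble the nqcmd argument list used to run a purge script."""
    return [nqcmd_path, "-d", dsn, "-u", user, "-p", password, "-s", sql_file]

def purge_bi_server_cache(command):
    """Run nqcmd; raises CalledProcessError if it exits with a non-zero code."""
    return subprocess.run(command, check=True, capture_output=True, text=True)

command = build_nqcmd_command(
    "/path/to/nqcmd",            # placeholder for the nqcmd binary location
    "AnalyticsWeb",
    "administrator",
    "password",
    "/path/to/purgecache.txt",   # placeholder for the file from step 1
)
print(" ".join(command))
# purge_bi_server_cache(command)  # uncomment on a server where nqcmd exists
```

Passing the arguments as a list avoids shell quoting problems if the password contains special characters.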

Step 2: Clear Cache in Oracle Presentation Server

OBIEE 11G has a catalog manager command called “ClearQueryCache” to clear out the Presentation Server cache.

The syntax of ClearQueryCache command is:
runcat.cmd/runcat.sh -cmd clearQueryCache -online <OBIPS URL> -credentials <credentials properties file>

1) Create a catalog manager credential properties file. Open a text file and type the following entries.
login = <weblogic_admin_Username>
pwd = <weblogic_Admin_Userpassword>
Save the file in a directory of your choice under the name catmancredentials.properties.

2) Open a command prompt and navigate to <MW_HOME>\instances\instance1\bifoundation\OracleBIPresentationServicesComponent\coreapplication_obips1\catalogmanager\

3) Run the following command to clear OBIPS query cache:
runcat.sh -cmd clearQueryCache -online http://host:port/analytics/saw.dll -credentials catmancredentials.properties
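
Since the credentials file holds a weblogic admin password in clear text, it is worth creating it with restrictive permissions. The Python sketch below is one way to do that; write_credentials and build_runcat_command are hypothetical helper names, the URL is the same placeholder as above, and the 0600 mode only has an effect on UNIX-like systems.

```python
import os
import tempfile

def write_credentials(path, login, pwd):
    """Write a Catalog Manager credentials file readable only by its owner."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(f"login = {login}\npwd = {pwd}\n")

def build_runcat_command(runcat, obips_url, credentials_file):
    """Assemble the clearQueryCache call shown above as an argument list."""
    return [runcat, "-cmd", "clearQueryCache", "-online", obips_url,
            "-credentials", credentials_file]

# Demo in a temporary directory so nothing real is overwritten.
demo_dir = tempfile.mkdtemp()
cred_path = os.path.join(demo_dir, "catmancredentials.properties")
write_credentials(cred_path, "weblogic", "changeme")  # sample values only
runcat_command = build_runcat_command(
    "runcat.sh", "http://host:port/analytics/saw.dll", cred_path
)
print(" ".join(runcat_command))
```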

Step 3: Enable Passwordless Login Option between DAC and OBI Servers

1) Log in to server x.x.x.1 (DAC server) as user oracle and generate a public/private key pair using the following command.

ssh-keygen -t rsa

2) Use SSH from server x.x.x.1 (DAC server) to connect to server x.x.x.2 (OBI server) as user oracle and create the .ssh directory under its home directory, using the following command.

ssh oracle@x.x.x.2 mkdir -p .ssh

3) From server x.x.x.1, upload the newly generated public key (id_rsa.pub) to server x.x.x.2, appending it to the authorized_keys file in oracle's .ssh directory.

cat .ssh/id_rsa.pub | ssh oracle@x.x.x.2 'cat >> .ssh/authorized_keys'

4) Because SSH versions can differ between servers, set the permissions on the .ssh directory and the authorized_keys file on server x.x.x.2.

ssh oracle@x.x.x.2 "chmod 700 .ssh; chmod 640 .ssh/authorized_keys"
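
Before wiring this into DAC, it helps to confirm that the login from the DAC server to the OBI server really works without a prompt. Here is a small Python sketch that does so using ssh's BatchMode option, which makes ssh fail instead of asking for a password. The helper names and the obi-host placeholder are hypothetical.

```python
import subprocess

def build_ssh_check(user, host):
    """Build an ssh command that fails instead of prompting for a password."""
    return ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=5",
            f"{user}@{host}", "true"]

def passwordless_login_works(user, host):
    """Return True if the remote 'true' command ran without any prompt."""
    result = subprocess.run(build_ssh_check(user, host),
                            capture_output=True, text=True)
    return result.returncode == 0

check_command = build_ssh_check("oracle", "obi-host")  # placeholder host name
print(" ".join(check_command))
# print(passwordless_login_works("oracle", "obi-host"))  # run on the DAC server
```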

Step 4: Invoke the PurgeCache script from DAC Server

1) Create a NEW ITEM in the TASK TAB of the DESIGN view, then enter the TASK NAME, COMMAND FOR INCREMENTAL LOAD, COMMAND FOR FULL LOAD, TASK PHASE, EXECUTION TYPE and EXECUTING PRIORITY.

EXAMPLE FOR CLEAR OBIEE CACHE:

NAME: Clear OBIEE Cache

COMMAND FOR INCREMENTAL LOAD: sh XXX.sh

COMMAND FOR FULL LOAD: sh XXX.sh

TASK PHASE: post ETL process

EXECUTION TYPE: External Program

EXECUTING PRIORITY: 5

2) Add the NEW TASK to the FOLLOWING TASKS of the EXECUTION PLAN.

Following tasks are the tasks which will be executed after the execution plan is done.

3) Create EXECUTION PLAN