the WEM - Web
NEW: Macro- & Micro-Mining: Web server log file examples.
Paper (German) & Poster (English).
Knowledge eXtended conference, Jülich (Germany), 2-4 November 2005.
WEM is a small Java application that analyzes web server log files in the
combined log format (as written by, e.g., the Apache web server).
WEM displays the top 100 entry pages (entry points) of a website and breaks
the entries to each page down into the following entry types:
- search engine (SE-Entries) - website entries that arrive via a search
engine query (e.g. referer www.google.com/search?hl=en&q=university+berlin),
- backlink or external link (R-Entries) - website entries that arrive via a
link on another website (e.g. referer www.hu-berlin.de),
- direct access (D-Entries) - website entries by direct navigation, logged
with an empty referer field (e.g. "-").
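
The three entry types above can be told apart from the referer field alone. The following Python sketch illustrates one way to do this; it is not WEM's actual implementation. The function name classify_entry, the list of search engine names, and the query parameter names are assumptions made for illustration, as is the rule that a referer from the analyzed site itself counts as internal navigation rather than an entry.

```python
from urllib.parse import urlsplit, parse_qs

# Assumption: a short illustrative list, not the list WEM uses.
SEARCH_ENGINE_NAMES = {"google", "bing", "yahoo", "duckduckgo"}
# Assumption: query parameter names commonly used for search terms.
QUERY_PARAMS = ("q", "query", "p")


def classify_entry(referer, site_domain):
    """Return 'SE', 'R' or 'D', or None for navigation inside the site."""
    referer = (referer or "").strip()
    if referer in ("", "-"):
        return "D"  # no referer recorded: direct access

    # urlsplit only recognizes the host when a scheme is present.
    if "://" not in referer:
        referer = "http://" + referer
    parts = urlsplit(referer)
    host = (parts.hostname or "").lower()
    site = site_domain.lower()

    # Assumption: a referer from the analyzed site is not an entry.
    if host == site or host.endswith("." + site):
        return None

    # Search engine entry: known engine name in the host plus a query term.
    is_engine = any(label in SEARCH_ENGINE_NAMES for label in host.split("."))
    query = parse_qs(parts.query)
    has_query = any(query.get(name) for name in QUERY_PARAMS)
    if is_engine and has_query:
        return "SE"

    return "R"  # any other external referer: backlink
```

With www.ib.hu-berlin.de as the site domain, the three example referers from the list are classified as SE, R and D respectively.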
WEM implements the concept of Web Entry Factors (WEF).
See the master's thesis by Philipp Mayr: http://www.ib.hu-berlin.de/~kumlau/handreichungen/h129/
The only restriction: the log files must come from an Apache web server, and
the referer field must be recorded (combined log format).
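
To check whether a log file meets this restriction before running an analysis, each line can be tested for the referer field. The Python sketch below assumes Apache's standard combined log format (host, identity, user, time, request, status, size, referer, user agent). The names parse_line and has_referer are hypothetical, and the pattern does not handle escaped quotes inside quoted fields.

```python
import re

# The trailing referer and user agent fields are optional in this pattern,
# so a common-format line (without them) still parses, with referer = None.
LOG_LINE = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?\s*$'
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not parse."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None


def has_referer(line):
    """True if the line parses and carries a referer field (combined format)."""
    record = parse_line(line)
    return record is not None and record["referer"] is not None


combined = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
            '"GET /apache_pb.gif HTTP/1.0" 200 2326 '
            '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"')
common = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
          '"GET /apache_pb.gif HTTP/1.0" 200 2326')

print(has_referer(combined))  # True
print(has_referer(common))    # False
```

A log whose lines all return False here was most likely written in the common log format and carries no referer information to classify.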
To test WEM, please download the executable
file and analyze your log data locally. You can analyze your own data or anonymized
- Download the WEM [wem.exe,
- Download Sample Log File 1 [sample1.log, 1.6 MB]
- Download Sample Log File 2 [sample2.log,
- Download Sample Log File 3 [sample3.log, 2.2 MB]
Note: the sample logs are for demonstration only. If you would like to test the
samples, please enter www.ib.hu-berlin.de as the domain URL.
Larger samples are required for a better understanding of WEM.
To run WEM:
- Start the exe file.
- Select File>>New Analysis from the menu.
- Enter the domain of the website you want to analyze.
- Select a log file (Note: only from an Apache
Webserver). Note: WEM needs raw log files. Please unzip compressed archives
- Click Start Analysis
- Wait a moment! A status bar displays mining
- Select pages/URLs in the top 100 pages list
(left). On the right you will see the proportional rates of the three entry types.
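
The ranking and the proportional rates shown in the last step can be approximated as follows. This is a minimal Python sketch, not WEM's own code. It assumes the entries are already available as (page, entry type) pairs with the types SE, R and D, and the function name entry_rates is hypothetical.

```python
from collections import Counter, defaultdict


def entry_rates(entries, top_n=100):
    """Rank entry pages by number of entries and compute the share of each entry type.

    entries: iterable of (page, entry_type) pairs, entry_type in 'SE', 'R', 'D'.
    Returns a list of (page, total_entries, rates) tuples, most entries first.
    """
    counts = defaultdict(Counter)
    for page, entry_type in entries:
        counts[page][entry_type] += 1

    # Sort by total entries (descending), then by page name for a stable order.
    ranked = sorted(counts.items(),
                    key=lambda item: (-sum(item[1].values()), item[0]))

    result = []
    for page, per_type in ranked[:top_n]:
        total = sum(per_type.values())
        rates = {t: per_type[t] / total for t in ("SE", "R", "D")}
        result.append((page, total, rates))
    return result


sample = [("/", "SE"), ("/", "SE"), ("/", "D"), ("/", "R"), ("/a.html", "R")]
for page, total, rates in entry_rates(sample):
    print(page, total, rates)
```

For the sample data, the page "/" has four entries, half of them from search engines, while "/a.html" has a single backlink entry.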
contact: Philipp Mayr, Home