WebClippingPortlet


Class Name : org.apache.jetspeed.portal.portlets.WebClippingPortlet


Description

Presents one or more clipped parts of one or more web pages in a portlet. A clipped part is the section of a web page between a start tag and a stop tag. This portlet supports nested tags:

<table> ... <table> ... </table> ... </table>

In this case you can clip the complete table, or only the inner table.
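
As a rough sketch, the inner table might be selected with parameters like the ones below, written in the registry syntax used in the example entry later in this document. It assumes that the inner table is the second table tag on the page and that startTagNumber counts identical tags in document order; the suffix 1 is only illustrative.

```xml
<!-- Assumption: the inner table is the second <table> tag on the page -->
<parameter name="startTag1" value="table" hidden="false"/>
<parameter name="stopTag1" value="table" hidden="false"/>
<parameter name="startTagNumber1" value="2" hidden="false"/>
```

Leaving out startTagNumber1 would presumably clip from the first table tag, that is, the complete outer table.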

Example of how to find a tag:

Find: <img src="http://www.apache.org/images/asf_logo_wide.png" width="537" height="51" alt="Apache Software Foundation"/>

If it's the first image (or there's only one image) in the page, you can clip it with:

Tag...="img"

But you can also find the image with:

Tag...="img src=http://www.apache.org/images/asf_logo_wide.png width=537 height=51 alt=Apache Software Foundation"

Tag...="img src=http://www.apache.org/images/asf_logo_wide.png"

Or (simpler!):

Tag...="img asf_logo_wide.png"

Tag...="img 537 Apache"

The match is made on the tag name (img) and any portion of its attributes (width=537, 537, asf_logo_wide.png, ...).
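
Written as registry parameters, the simplest of these matches could look like the fragment below. The suffix 1 and the url1 value are assumptions made for illustration; the page that actually contains the image may differ.

```xml
<!-- Hypothetical fragment: match the img tag by a portion of its src attribute -->
<parameter name="Tag1" value="img asf_logo_wide.png" hidden="false"/>
<!-- Assumed page URL, not given in the text above -->
<parameter name="url1" value="http://www.apache.org/" hidden="false"/>
```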

Supported Media Types

Description of Media Types.

  • html

Element: parameter

Parameters that control which parts of the web pages are converted into a portlet. Each parameter name ends with a number: startTag1, startTag2, ...

This element is required

Parameters common to many portlets.

Parameter Name      Description
startTag...         First tag to include in the portlet.
stopTag...          Last tag to include in the portlet.
Tag...              Single tag to include in the portlet, e.g. to include an image. Specify either a startTag.../stopTag... pair or a single Tag... parameter.
startTagNumber...   If the page contains several identical tags, you can select one of them using startTag... (or Tag...) and startTagNumber.... Example: if the page has four paragraphs, you can select the second with startTag...="p" and startTagNumber...="2".
url...              A different URL can be specified for each element to clip. If it is not specified, the default URL is used. A local URL can also be used, e.g. /examples/local-page.html
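
The parameters above might be combined as in the following sketch. The numeric suffixes and the choice of elements are illustrative, and the fragment assumes that parameters sharing a suffix (for example Tag2 and url2) describe the same clipped element, as the registry example later in this document suggests.

```xml
<!-- Element 1: the second paragraph of the default URL (start/stop pair) -->
<parameter name="startTag1" value="p" hidden="false"/>
<parameter name="stopTag1" value="p" hidden="false"/>
<parameter name="startTagNumber1" value="2" hidden="false"/>
<!-- Element 2: a single image taken from a local page instead of the default URL -->
<parameter name="Tag2" value="img" hidden="false"/>
<parameter name="url2" value="/examples/local-page.html" hidden="false"/>
```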

Element: url

Contains the default URL of the web page to be displayed. It's possible to use a local URL.

This element is required

General information about the URL.

Good URLs:

<url>http://jakarta.apache.org/jetspeed</url>
Basic URL
<url>/examples/local-page.html</url>
Local URL
<url>http://search.yahoo.com/bin/search?p=jetspeed</url>
One parameter passed
<url>http://search.yahoo.com/search?p=jetspeed&amp;n=100</url>
Two parameters passed

Bad URLs:

<url>http://search.yahoo.com/search?p=jetspeed&n=100</url>
Contains & instead of &amp;.
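
The same escaping applies wherever a URL appears in the registry file, because a bare & is not allowed in XML text or attribute values. A minimal sketch for a per-element URL, assuming the illustrative suffix 1:

```xml
<!-- The & between query parameters is written as &amp; inside the attribute value -->
<parameter name="url1" value="http://search.yahoo.com/search?p=jetspeed&amp;n=100" hidden="false"/>
```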

Example of Registry Entry

This example shows local content, an image from the Jakarta site, and a clipped part of the Tomcat home page.

From <jetspeed_home>/WEB-INF/conf/WebClipping.xreg

      <portlet-entry name="Different URLs example" hidden="false" type="ref" parent="WebClippingPortlet" application="false">
        <meta-info>
            <title>Different URLs example</title>
            <description>Example of clipping from different URLs</description>
        </meta-info>
        <parameter name="startTag1" value="p" hidden="false"/>
        <parameter name="stopTag1" value="p" hidden="false"/>
        <parameter name="Tag2" value="img" hidden="false"/>
        <parameter name="url2" value="http://jakarta.apache.org/" hidden="false"/>
        <parameter name="startTag3" value="p" hidden="false"/>
        <parameter name="stopTag3" value="p" hidden="false"/>
        <parameter name="startTagNumber3" value="2" hidden="false"/>
        <parameter name="startTag4" value="table" hidden="false"/>
        <parameter name="stopTag4" value="table" hidden="false"/>
        <parameter name="startTagNumber4" value="3" hidden="false"/>
        <parameter name="url4" value="http://jakarta.apache.org/tomcat/" hidden="false"/>
        <url>/examples/local-content.html</url>
        <category>sites</category>
      </portlet-entry>

Known problems (version 1.4b4)

  • No known problems, but more testing is needed