Reverse Proxy Module

The Reverse Proxy Module provides reverse proxy features. It consists of HTTP Client builder components, Reverse Proxy Command/Chain components, and Reverse Proxy servlets and filters.

With the Reverse Proxy Module, you can serve more sophisticated content, especially by plugging in a custom content rewriter, and you can also allow cross-domain scripting for trusted applications.

Installation

If your project uses Apache Maven, add the following dependency to use this module:

    <dependency>
      <groupId>org.apache.portals.applications</groupId>
      <artifactId>apa-webcontent2-reverse-proxy</artifactId>
      <version>${webcontent2.version}</version>
    </dependency>
          

For more information on developing and testing, see the README file.

SimpleReverseProxyServlet

A simple Reverse Proxy servlet can be configured in web.xml as in the following example:

  <!-- Reverse Proxy Servlet -->
  <servlet>
    <servlet-name>ReverseProxyServlet</servlet-name>
    <servlet-class>org.apache.portals.applications.webcontent2.proxy.servlet.SimpleReverseProxyServlet</servlet-class>
    <init-param>
      <param-name>mappings</param-name>
      <param-value>
        /WEB-INF/rproxy-mappings.yaml
      </param-value>
    </init-param>
    <init-param>
      <param-name>ssl-hostname-verifier</param-name>
      <param-value>ALLOW_ALL_HOSTNAME_VERIFIER</param-value>
    </init-param>
  </servlet>

  <!-- Map /rproxyservlet/* path to the Reverse Proxy Servlet -->
  <servlet-mapping>
    <servlet-name>ReverseProxyServlet</servlet-name>
    <url-pattern>/rproxyservlet/*</url-pattern>
  </servlet-mapping>
            

The servlet (org.apache.portals.applications.webcontent2.proxy.servlet.SimpleReverseProxyServlet) supports the following init parameters:

Name: mappings
Default Value: /WEB-INF/rproxy-mappings.xml
Description: YAML configuration for path mappings and reverse path mappings.
This parameter value can be any of the following:
  • A file path resource prefixed by 'file:'.
  • A classpath resource prefixed by 'classpath:'.
  • A context-relative path resource prefixed by '/'.
  • A YAML string.

Note: Variables enclosed in '${' and '}' are expanded from Java System properties. For example, '${user.home}/rproxy-mappings.xml' expands to '/home/user1/rproxy-mappings.xml' if the user's home directory is '/home/user1'.

Name: ssl-hostname-verifier
Default Value: BROWSER_COMPATIBLE_HOSTNAME_VERIFIER
Example Value: ALLOW_ALL_HOSTNAME_VERIFIER
Description: One of "ALLOW_ALL_HOSTNAME_VERIFIER", "BROWSER_COMPATIBLE_HOSTNAME_VERIFIER" or "STRICT_HOSTNAME_VERIFIER". The value is matched case-insensitively.
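
The variable expansion described in the note on the 'mappings' parameter can be pictured with a short Java sketch. This is only an illustration under stated assumptions, not the module's own code: the class name PropertyExpansionSketch and its expand method are hypothetical, and leaving an unknown variable untouched is an assumption, since the documentation does not say what happens in that case.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of the module): expands ${name} placeholders
// from Java System properties.
final class PropertyExpansionSketch {

    private static final Pattern VARIABLE = Pattern.compile("\\$\\{([^}]+)\\}");

    static String expand(String value) {
        Matcher m = VARIABLE.matcher(value);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String replacement = System.getProperty(m.group(1));
            // Assumption: a variable without a matching property is left as-is.
            if (replacement == null) {
                replacement = m.group(0);
            }
            // quoteReplacement keeps '$' and '\' in property values literal.
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        // The output depends on the user.home property of the running JVM.
        System.out.println(expand("${user.home}/rproxy-mappings.xml"));
    }
}
```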

The Reverse Proxy mapping configuration file referenced in the web.xml example above (/WEB-INF/rproxy-mappings.yaml) can look like the following:

--- !simple
local: /portals/applications/
remote: http://portals.apache.org/applications/
contentRewriters:
    text/html: !!org.apache.portals.applications.webcontent2.proxy.rewriter.DefaultReverseProxyTextLineContentRewriter []

--- !simple
local: /portals/bridges/
remote: http://portals.apache.org/bridges/
contentRewriters:
    text/html: !!org.apache.portals.applications.webcontent2.proxy.rewriter.DefaultReverseProxyTextLineContentRewriter []

--- !simple
local: /localhost/examples1/
remote: //localhost:8080/webcontent2/examples1/
contentRewriters:
    text/html: !!org.apache.portals.applications.webcontent2.proxy.rewriter.DefaultReverseProxyTextLineContentRewriter []

--- !regex
localPattern: ^/apache/(\w+)/(.*)$
remoteReplace: http://$1.apache.org/$2
remotePattern: ^http://(\w+)\.apache\.org/(.*)$
localReplace: /apache/$1/$2
contentRewriters:
    text/html: !!org.apache.portals.applications.webcontent2.proxy.rewriter.DefaultReverseProxyTextLineContentRewriter []

--- !regex
localPattern: ^/localhost/examples2/(.*)$
remoteReplace: //localhost:8080/webcontent2/examples2/$1
remotePattern: ^https?://localhost:8080/webcontent2/examples2/(.*)$
localReplace: /localhost/examples2/$1
contentRewriters:
    text/html: !!org.apache.portals.applications.webcontent2.proxy.rewriter.DefaultReverseProxyTextLineContentRewriter []

--- !simple
local: /
remote: http://apache.org/
contentRewriters:
    text/html: !!org.apache.portals.applications.webcontent2.proxy.rewriter.DefaultReverseProxyTextLineContentRewriter []
          

SimpleReverseProxyFilter

You can use a servlet filter instead, with the same Reverse Proxy mapping configuration, as in the following example:

  <!-- Reverse Proxy Filter -->
  <filter>
    <filter-name>ReverseProxyFilter</filter-name>
    <filter-class>org.apache.portals.applications.webcontent2.proxy.filter.SimpleReverseProxyFilter</filter-class>
    <init-param>
      <param-name>filterPath</param-name>
      <param-value>/rproxyfilter</param-value>
    </init-param>
    <init-param>
      <param-name>mappings</param-name>
      <param-value>
        /WEB-INF/rproxy-mappings.yaml
      </param-value>
    </init-param>
    <init-param>
      <param-name>ssl-hostname-verifier</param-name>
      <param-value>ALLOW_ALL_HOSTNAME_VERIFIER</param-value>
    </init-param>
  </filter>

  <!-- Map /rproxyfilter/* path to the Reverse Proxy Filter -->
  <filter-mapping>
    <filter-name>ReverseProxyFilter</filter-name>
    <url-pattern>/rproxyfilter/*</url-pattern>
    <dispatcher>REQUEST</dispatcher>
    <dispatcher>INCLUDE</dispatcher>
    <dispatcher>FORWARD</dispatcher>
  </filter-mapping>
            

The servlet filter (org.apache.portals.applications.webcontent2.proxy.filter.SimpleReverseProxyFilter) supports the following init parameters:

Name: mappings
Default Value: /WEB-INF/rproxy-mappings.xml
Description: YAML configuration for path mappings and reverse path mappings.
This parameter value can be any of the following:
  • A file path resource prefixed by 'file:'.
  • A classpath resource prefixed by 'classpath:'.
  • A context-relative path resource prefixed by '/'.
  • A YAML string.

Note: Variables enclosed in '${' and '}' are expanded from Java System properties. For example, '${user.home}/rproxy-mappings.xml' expands to '/home/user1/rproxy-mappings.xml' if the user's home directory is '/home/user1'.

Name: ssl-hostname-verifier
Default Value: BROWSER_COMPATIBLE_HOSTNAME_VERIFIER
Example Value: ALLOW_ALL_HOSTNAME_VERIFIER
Description: One of "ALLOW_ALL_HOSTNAME_VERIFIER", "BROWSER_COMPATIBLE_HOSTNAME_VERIFIER" or "STRICT_HOSTNAME_VERIFIER". The value is matched case-insensitively.

Configuring Reverse Proxy Mappings

In a Reverse Proxy mappings configuration file, you list all the (reverse) path mapping configurations as a sequence of YAML documents.

At the moment, two built-in mapping configuration types are supported:

  • Simple mapping
  • Regular Expression based mapping

Note: YAML configurations are parsed internally with the SnakeYAML library, so the document type hints and constructor hints are handled by SnakeYAML.

Simple mapping

Simple mapping maps a local path to a remote URL by replacing the configured local path prefix with the configured remote URL prefix. Conversely, a remote URL is mapped to a local path by replacing the configured remote URL prefix with the configured local path prefix.

Note: A simple mapping should start with the line '--- !simple', which denotes a new YAML document with the built-in document type hint ('simple').

For example, a simple mapping can be configured like the following:

--- !simple
local: /portals/applications/
remote: http://portals.apache.org/applications/
          

In this simple mapping, for example, if the context path is '/webcontent2' and the context-relative request path is '/portals/applications/a/b/c.html' (e.g., 'http://localhost:8080/webcontent2/portals/applications/a/b/c.html'), then the resolved remote URL will be 'http://portals.apache.org/applications/a/b/c.html'.

You can also specify which content rewriter components should rewrite the remote content by setting 'contentRewriters' to a YAML map, as in the following example:
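
The prefix replacement in both directions can be sketched in a few lines of Java. This is a minimal illustration rather than the module's implementation: the class SimplePrefixMappingSketch and its methods toRemote and toLocal are hypothetical names, and returning null for a non-matching input is an assumption.

```java
// Hypothetical sketch of prefix-based mapping; not the module's actual classes.
final class SimplePrefixMappingSketch {

    private final String local;   // e.g. "/portals/applications/"
    private final String remote;  // e.g. "http://portals.apache.org/applications/"

    SimplePrefixMappingSketch(String local, String remote) {
        this.local = local;
        this.remote = remote;
    }

    // Maps a context-relative local path to a remote URL, or returns null
    // (an assumption of this sketch) when the local prefix does not match.
    String toRemote(String localPath) {
        return localPath.startsWith(local)
                ? remote + localPath.substring(local.length())
                : null;
    }

    // Maps a remote URL back to a local path, or returns null when the
    // remote prefix does not match.
    String toLocal(String remoteUrl) {
        return remoteUrl.startsWith(remote)
                ? local + remoteUrl.substring(remote.length())
                : null;
    }

    public static void main(String[] args) {
        SimplePrefixMappingSketch mapping = new SimplePrefixMappingSketch(
                "/portals/applications/", "http://portals.apache.org/applications/");
        // prints http://portals.apache.org/applications/a/b/c.html
        System.out.println(mapping.toRemote("/portals/applications/a/b/c.html"));
        // prints /portals/applications/a/b/c.html
        System.out.println(mapping.toLocal("http://portals.apache.org/applications/a/b/c.html"));
    }
}
```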

--- !simple
local: /portals/applications/
remote: http://portals.apache.org/applications/
contentRewriters:
    text/html: !!org.apache.portals.applications.webcontent2.proxy.rewriter.DefaultReverseProxyTextLineContentRewriter []
          

With the example configuration above, an instance of org.apache.portals.applications.webcontent2.proxy.rewriter.DefaultReverseProxyTextLineContentRewriter is created as the content rewriter for remote content of the 'text/html' content type.

Note: SnakeYAML interprets '!!org.apache.portals.applications.webcontent2.proxy.rewriter.DefaultReverseProxyTextLineContentRewriter []' in a special way: it instantiates the org.apache.portals.applications.webcontent2.proxy.rewriter.DefaultReverseProxyTextLineContentRewriter class with its default constructor (as specified by '[]'). See the SnakeYAML homepage for details.

You can also set a scheme-less remote URL prefix, as in the following example. In this case, the URL scheme is inferred from the current servlet request (e.g., javax.servlet.ServletRequest#getScheme()).

--- !simple
local: /localhost/examples1/
remote: //localhost:8080/webcontent2/examples1/
contentRewriters:
    text/html: !!org.apache.portals.applications.webcontent2.proxy.rewriter.DefaultReverseProxyTextLineContentRewriter []
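
How a scheme-less remote URL might be completed from the request scheme can be sketched as follows. The helper SchemeLessUrlSketch.resolve is a hypothetical name used only for illustration; the module's real resolution logic may differ, and the requestScheme argument stands in for the value of ServletRequest#getScheme().

```java
// Hypothetical sketch: completes a scheme-less URL with the request scheme.
final class SchemeLessUrlSketch {

    // requestScheme stands in for ServletRequest#getScheme(), e.g. "http" or "https".
    static String resolve(String remoteUrl, String requestScheme) {
        // A URL starting with "//" has no scheme; absolute URLs are kept unchanged.
        return remoteUrl.startsWith("//") ? requestScheme + ":" + remoteUrl : remoteUrl;
    }

    public static void main(String[] args) {
        // prints https://localhost:8080/webcontent2/examples1/index.html
        System.out.println(resolve("//localhost:8080/webcontent2/examples1/index.html", "https"));
        // prints http://portals.apache.org/applications/
        System.out.println(resolve("http://portals.apache.org/applications/", "https"));
    }
}
```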
          

Regular Expression based mapping

Regular Expression based mapping maps a local path to a remote URL by matching the local path against the configured local path pattern ('localPattern') and replacing it with the configured 'remoteReplace' string. Conversely, a remote URL is mapped to a local path by matching the remote URL against the configured remote URL pattern ('remotePattern') and replacing it with the configured 'localReplace' string.

Note: A regular expression based mapping should start with the line '--- !regex', which denotes a new YAML document with the built-in document type hint ('regex').

For example, a Regular Expression based mapping can be configured like the following:

--- !regex
localPattern: ^/apache/(\w+)/(.*)$
remoteReplace: http://$1.apache.org/$2
remotePattern: ^http://(\w+)\.apache\.org/(.*)$
localReplace: /apache/$1/$2
          

In this Regular Expression based mapping, for example, if the context path is '/webcontent2' and the context-relative request path is '/apache/portals/a/b/c.html' (e.g., 'http://localhost:8080/webcontent2/apache/portals/a/b/c.html'), then the resolved remote URL will be 'http://portals.apache.org/a/b/c.html'.

As with simple mappings, you can specify which content rewriter components should rewrite the remote content by setting 'contentRewriters' to a YAML map, as in the following example:
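
The pattern matching and replacement in both directions can be reproduced with java.util.regex. The sketch below uses the four configuration values from the example; the class RegexMappingSketch and its method names are hypothetical, and returning null for a non-matching input is an assumption of this sketch rather than documented behavior.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of regular expression based mapping; not the module's classes.
final class RegexMappingSketch {

    private final Pattern localPattern;
    private final String remoteReplace;
    private final Pattern remotePattern;
    private final String localReplace;

    RegexMappingSketch(String localPattern, String remoteReplace,
                       String remotePattern, String localReplace) {
        this.localPattern = Pattern.compile(localPattern);
        this.remoteReplace = remoteReplace;
        this.remotePattern = Pattern.compile(remotePattern);
        this.localReplace = localReplace;
    }

    // Local path -> remote URL; null (an assumption) when the pattern does not match.
    String toRemote(String localPath) {
        Matcher m = localPattern.matcher(localPath);
        return m.matches() ? m.replaceFirst(remoteReplace) : null;
    }

    // Remote URL -> local path; null when the pattern does not match.
    String toLocal(String remoteUrl) {
        Matcher m = remotePattern.matcher(remoteUrl);
        return m.matches() ? m.replaceFirst(localReplace) : null;
    }

    public static void main(String[] args) {
        RegexMappingSketch mapping = new RegexMappingSketch(
                "^/apache/(\\w+)/(.*)$", "http://$1.apache.org/$2",
                "^http://(\\w+)\\.apache\\.org/(.*)$", "/apache/$1/$2");
        // prints http://portals.apache.org/a/b/c.html
        System.out.println(mapping.toRemote("/apache/portals/a/b/c.html"));
        // prints /apache/portals/a/b/c.html
        System.out.println(mapping.toLocal("http://portals.apache.org/a/b/c.html"));
    }
}
```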

--- !regex
localPattern: ^/apache/(\w+)/(.*)$
remoteReplace: http://$1.apache.org/$2
remotePattern: ^http://(\w+)\.apache\.org/(.*)$
localReplace: /apache/$1/$2
contentRewriters:
    text/html: !!org.apache.portals.applications.webcontent2.proxy.rewriter.DefaultReverseProxyTextLineContentRewriter []
          

With the example configuration above, an instance of org.apache.portals.applications.webcontent2.proxy.rewriter.DefaultReverseProxyTextLineContentRewriter is created as the content rewriter for remote content of the 'text/html' content type.

Note: SnakeYAML interprets '!!org.apache.portals.applications.webcontent2.proxy.rewriter.DefaultReverseProxyTextLineContentRewriter []' in a special way: it instantiates the org.apache.portals.applications.webcontent2.proxy.rewriter.DefaultReverseProxyTextLineContentRewriter class with its default constructor (as specified by '[]'). See the SnakeYAML homepage for details.

You can also set a scheme-less 'remoteReplace' string, as in the following example. In this case, the URL scheme is inferred from the current servlet request (e.g., javax.servlet.ServletRequest#getScheme()).

--- !regex
localPattern: ^/localhost/examples2/(.*)$
remoteReplace: //localhost:8080/webcontent2/examples2/$1
remotePattern: ^https?://localhost:8080/webcontent2/examples2/(.*)$
localReplace: /localhost/examples2/$1
contentRewriters:
    text/html: !!org.apache.portals.applications.webcontent2.proxy.rewriter.DefaultReverseProxyTextLineContentRewriter []
          
Note: The remotePattern can start with a specific scheme pattern ("http" or "https") because the remotePattern is always evaluated against the current, specific remote target URL or redirection location URL.
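
To illustrate the note above, the following Java sketch applies the remotePattern and localReplace values from the example to URLs that carry an explicit scheme. The class RemotePatternSchemeSketch is a hypothetical name, and returning null when nothing matches is an assumption made for this sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: reverse mapping with a scheme-aware remotePattern.
final class RemotePatternSchemeSketch {

    // remotePattern and localReplace are taken from the example configuration.
    private static final Pattern REMOTE_PATTERN =
            Pattern.compile("^https?://localhost:8080/webcontent2/examples2/(.*)$");
    private static final String LOCAL_REPLACE = "/localhost/examples2/$1";

    // Maps a remote URL to a local path, or returns null when it does not match.
    static String toLocal(String remoteUrl) {
        Matcher m = REMOTE_PATTERN.matcher(remoteUrl);
        return m.matches() ? m.replaceFirst(LOCAL_REPLACE) : null;
    }

    public static void main(String[] args) {
        // Both lines print /localhost/examples2/index.html
        System.out.println(toLocal("http://localhost:8080/webcontent2/examples2/index.html"));
        System.out.println(toLocal("https://localhost:8080/webcontent2/examples2/index.html"));
    }
}
```

Because the pattern requires "http" or "https", a scheme-less input such as '//localhost:8080/webcontent2/examples2/index.html' does not match in this sketch.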

Extending the Default Reverse Proxy Service

To maximize extensibility, the Reverse Proxy Module is implemented with the Chain of Responsibility pattern, using the Apache Commons Chain library.

By default, org.apache.portals.applications.webcontent2.proxy.builder.DefaultProxyProcessingChainBuilder initializes and adds all the common commands to the internal reverse proxy service component. However, you can always modify the command chains in your extended servlet, filter, or portlet classes.

For example, org.apache.jetspeed.portlets.sso.SSOReverseProxyServlet in the j2-admin project extends org.apache.portals.applications.webcontent2.proxy.servlet.SimpleReverseProxyServlet in order to replace the default org.apache.portals.applications.webcontent2.proxy.command.InitHttpRequestCommand with org.apache.jetspeed.portlets.sso.SSOInitHttpRequestCommand. SSOReverseProxyServlet also replaces the default HttpClientContextBuilder with a custom one, JetspeedHttpClientContextBuilder, in order to build custom authentication states based on the Jetspeed SSO Site credentials.
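
The idea of swapping one command in a processing chain can be shown with a small, self-contained Java sketch. It deliberately does not use Apache Commons Chain or the module's real classes: the Command interface, the Chain class, and the InitRequest, SsoInitRequest and SendRequest commands are all hypothetical stand-ins. The only borrowed convention is that a command returns true to stop the chain and false to let the next command run.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical Chain of Responsibility sketch; all names are illustrative only.
final class ProxyChainSketch {

    // A command returns true to stop the chain, false to continue.
    interface Command {
        boolean execute(Map<String, Object> context);
    }

    static final class Chain {
        private final List<Command> commands = new ArrayList<>();

        Chain add(Command command) {
            commands.add(command);
            return this;
        }

        // Replaces the first command of the given type, keeping its position.
        boolean replace(Class<? extends Command> type, Command replacement) {
            for (int i = 0; i < commands.size(); i++) {
                if (type.isInstance(commands.get(i))) {
                    commands.set(i, replacement);
                    return true;
                }
            }
            return false;
        }

        void execute(Map<String, Object> context) {
            for (Command command : commands) {
                if (command.execute(context)) {
                    return;
                }
            }
        }
    }

    // Stand-in for a default request initialization step.
    static final class InitRequest implements Command {
        public boolean execute(Map<String, Object> context) {
            context.put("auth", "none");
            return false;
        }
    }

    // Stand-in for a customized initialization step with SSO credentials.
    static final class SsoInitRequest implements Command {
        public boolean execute(Map<String, Object> context) {
            context.put("auth", "sso");
            return false;
        }
    }

    // Stand-in for a later step that depends on the initialization step.
    static final class SendRequest implements Command {
        public boolean execute(Map<String, Object> context) {
            context.put("sent", "request with auth=" + context.get("auth"));
            return false;
        }
    }

    static Chain defaultChain() {
        return new Chain().add(new InitRequest()).add(new SendRequest());
    }

    public static void main(String[] args) {
        Chain chain = defaultChain();
        // An extended servlet would customize the default chain like this.
        chain.replace(InitRequest.class, new SsoInitRequest());
        Map<String, Object> context = new HashMap<>();
        chain.execute(context);
        // prints request with auth=sso
        System.out.println(context.get("sent"));
    }
}
```

Replacing a command by type keeps the rest of the chain and its order intact, which is the property the SSO servlet described above relies on.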