

CHAPTER 18

Internet Smart Code





Introduction

Choosing "Internet" Smart Code for a callback causes Sun WorkShop Visual to generate a client application from your design. Internet Smart Code programming is about accessing pre-existing Web pages and CGI programs on public servers across the World Wide Web. The Smart Code callback appears in the client application (in a sub-directory named "callouts"). Figure 18-1 shows the structure of an application generated by Sun WorkShop Visual when an "Internet" callback is defined. Unlike thin client Smart Code, this type of callback generates only the client application and the communication code.

You will need to understand how to use both Groups and Get/Set Smart Code in order to use Internet (or thin client) Smart Code because Groups, along with their getters and setters, are the nuts and bolts of all types of Smart Code. Information on these subjects is found in:

  1. The grouping together of widgets is described in Chapter 15, "Groups", starting on page 479.
  2. Chapter 16, "Get/Set Smart Code", starting on page 487 describes the Get/Set Smart Code which provides you with toolkit-independent wrappers for the widgets in your design.
The "Go Live" feature allows you to use Sun WorkShop Visual's dynamic display as a prototype client in order to test your interface on live data as you are developing it. The tutorial starting on page 540 shows you how to do this and how to generate the application using a very simple example.

The use of "Go Live" for Internet Smart Code prototyping is limited because you will need to write your own Receive Handler to process and act on the incoming data.

FIGURE  18-1 Internet Callback Application Structure


Internet and Thin Client

Internet and thin client Smart Code are similar:

They both use the toolkit-independent getters and setters.
They both use HTTP (generated through the Sun WorkShop Visual URL API) as the means of communication.
They both make your design into a client application.
These similarities mean that parts of their user interface within Sun WorkShop Visual are shared. In order to use Internet, you may need to refer to the following sections which can be found in Chapter 17 "Thin Client Smart Code":

"Customizing the Server Connection" on page 516.
"Going Live" on page 524.
"Generated Code" on page 529.
Internet is, however, distinct from thin client:

For Internet Smart Code, no server application is generated.
The HTTP GET method is used for Internet Smart Code (rather than POST).
Because Internet designs are assumed to be thick clients, the way you structure them will be different.
Applications generated with Internet Smart Code might be used to:

Fetch and parse the contents of a World Wide Web page.
Connect to a pre-existing remote server.
Communicate with a server generated from another Sun WorkShop Visual design.


Receiving Data

For applications generated with Internet Smart Code you will need to provide a Receive Handler. Sun WorkShop Visual gives you a pointer to any data returned, but it is up to you to handle that data. Data handlers are part of the Customize dialog and described in "Customizing the Server Connection" on page 516. To make use of the data returned, Sun WorkShop Visual provides a library which allows you to express an interest in particular features of the input stream and then "pick out" these features as they arrive. This is described in "Extracting Information from HTML Data" on page 546.

You can either process the data as a stream or you can use the InputData class or object Sun WorkShop Visual provides to access it through the getData() and getSize() methods. This is particularly useful if you are downloading data to send to a display widget - for example, a gif or jpeg image. For C code, the InputData object is a data structure. For C++ and Java it is a class. See the online reference material for details of InputData by opening this file in an HTML browser and following the appropriate links:

$VISUROOT/lib/locale/<YourLocale>/sc/index.html

where VISUROOT is the install directory of your Sun WorkShop Visual and <YourLocale> is the locale you are using.
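
The getData()/getSize() access pattern can be sketched as follows. This is only an illustration: the real InputData declaration comes from the Sun WorkShop Visual headers, so a stand-in structure named MockInputData is used here. Its layout, the return types of the two function pointers and the helper names make_mock_input and copy_as_text are assumptions, modeled on the (*id->getData)( id) call style seen in the generated C stubs.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for InputData: only the two accessors named in the
 * text, called through function pointers as in the generated C stubs.
 * The return types are assumptions. */
typedef struct MockInputData MockInputData;
struct MockInputData {
    char *(*getData)(MockInputData *self);
    int   (*getSize)(MockInputData *self);
    char *bytes;   /* backing store for the mock */
    int   size;
};

static char *mock_get_data(MockInputData *self) { return self->bytes; }
static int   mock_get_size(MockInputData *self) { return self->size; }

/* Build a mock object around an existing buffer. */
MockInputData make_mock_input(char *bytes, int size)
{
    MockInputData id;

    id.getData = mock_get_data;
    id.getSize = mock_get_size;
    id.bytes   = bytes;
    id.size    = size;
    return id;
}

/* Copy the downloaded bytes into dst and NUL-terminate them so that they
 * can be handed to a text widget. Returns the number of bytes copied, or
 * -1 if dst is too small. */
int copy_as_text(MockInputData *id, char *dst, size_t dst_size)
{
    char *src = (*id->getData)(id);
    int   len = (*id->getSize)(id);

    if (len < 0 || (size_t)len + 1 > dst_size)
        return -1;
    memcpy(dst, src, (size_t)len);
    dst[len] = '\0';
    return len;
}
```

Binary data such as a gif or jpeg image should be passed on together with its size rather than being treated as a string, which is why the size accessor matters as much as the data pointer.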

For a simple example of how to process incoming data, set up an Internet Smart Code callback which uses the "@<widgetname>" shorthand notation as the Receive Handler, set the "Go Live" toggle and then generate code from your design. This is exactly what you will do in "Simple Internet Smart Code Tutorial" on page 540 below.


Communication Protocol

Sun WorkShop Visual assumes, if you have chosen "Internet" Smart Code, that you are fetching data from a location on the Internet and therefore uses the HTTP GET method. If you override the send handler by specifying a function name for it in the Customize dialog, Sun WorkShop Visual uses the POST method instead.
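
To make the difference between the two methods concrete, the sketch below formats a minimal HTTP/1.0 request of each kind. It is not the code that Sun WorkShop Visual generates (the URL API builds the requests for you); the function name build_request, the host "example.com" and the path "/cgi-bin/q" are invented for the illustration.

```c
#include <stdio.h>
#include <string.h>

/* Format a minimal HTTP/1.0 request into buf.
 * body == NULL produces a GET; otherwise a POST carrying body.
 * Returns the request length, or -1 if buf is too small. */
int build_request(char *buf, size_t size, const char *host,
                  const char *path, const char *body)
{
    int n;

    if (body == NULL) {
        /* GET: everything the server needs is in the request line */
        n = snprintf(buf, size, "GET %s HTTP/1.0\r\nHost: %s\r\n\r\n",
                     path, host);
    } else {
        /* POST: the data travels in the request body */
        n = snprintf(buf, size,
                     "POST %s HTTP/1.0\r\nHost: %s\r\n"
                     "Content-Type: application/x-www-form-urlencoded\r\n"
                     "Content-Length: %lu\r\n\r\n%s",
                     path, host, (unsigned long)strlen(body), body);
    }
    if (n < 0 || (size_t)n >= size)
        return -1;
    return n;
}
```

A GET request has no body, which suits fetching an existing page; a POST carries data in the body, which is why it is the method used once you supply your own send handler.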


Simple Internet Smart Code Tutorial

This example introduces you simply and quickly to Internet Smart Code. It connects to a real Web site and downloads data from it. You will see this happening both within Sun WorkShop Visual, using "Go Live", and in your generated application.

In order to get you familiar with the use of Internet Smart Code quickly, this example does not attempt to parse the returned data. Parsing of HTML is described in "Extracting Information from HTML Data" on page 546.


Note - This tutorial shows you how to connect to a remote server, so make sure you are working from a computer configured to do this.
  1. Create a hierarchy containing the widgets shown in Figure 18-2.
  These are: application shell->form->{button, scrolled text}.

FIGURE  18-2 Hierarchy for Internet Tutorial

  2. In the Layout Editor, attach the scrolled text widget to the bottom and right edges of the form so that it resizes when the window is resized.
  This is a purely cosmetic step - so that you are able to see the returned data more easily.
  3. Select text1 (the text area of the scrolled text widget). Press the "Add to New Group" button on the toolbar.
  This button is shown in Figure 18-3.

FIGURE  18-3 Add to New Group Toolbar Button

  4. When the Group Editor appears, check that it shows a group named Group0 containing a text widget as its only member, as shown in Figure 18-4.

FIGURE  18-4 Group Editor

  5. Close the Group Editor.
  We do not need to make any changes; we shall use the Group as created by Sun WorkShop Visual.
  6. Select button1 and display the Callbacks dialog.
  7. Check that "Activate" is selected from the list on the left and put in goInternet as the name of the callback.
  Do not add this callback yet as we have to define the Smart Code for it.
  8. Set the "Smart Code" toggle.
  9. Choose "Internet" from the option menu of Smart Code flavors.
  10. Select Group0 as the Group for this callback.
  Do this by pressing the "Group" toggle, making sure that Group0 is selected and pressing "Apply".
  11. Press the "Customize" button.
  This displays the Customize dialog.
  12. In the Customize dialog, type the following URL in the URL field:
  http://www.ist.co.uk/index.html
  13. If you are behind a firewall, set the Proxy host and port.
See "More on Proxies" on page 519 for more information on setting your proxies.

  14. In the Receive handler field put the following:
  @text1
See "Going Live" on page 524 for information on the use of "@" in these fields.

  15. Press "Ok" in the Customize dialog.
  The completed Customize dialog is shown in Figure 18-5.


Note - The Customize dialog shows fictitious proxies as an example - you must enter those which are relevant to your network, as described in "More on Proxies" on page 519.

FIGURE  18-5 Completed Customize Dialog

  16. Press "Add" to add your new callback.
  17. Still in the Callbacks dialog, set the "Go Live" toggle.
  When you set "Go Live" you do not need to "Update" the callback. The callback is immediately "Live".
  18. In the dynamic display, press button1.
  There is a pause while the connection is made with the remote server, then the returned data (which is the Web page specified in the Customize dialog) appears in text1, as shown in Figure 18-6.

FIGURE  18-6 Live Dynamic Display

The final stage of this tutorial shows you the same occurring in the generated application.

  19. Generate code for your Internet enabled design.
  You can generate any flavor of code, as long as you are able to compile it.
  20. Compile the generated code.
  21. Run your client application.
  Your application connects to the remote server and displays the specified Web page.

Going a Step Further

Having completed this tutorial, you may like to try some more advanced features of thin client and Internet Smart Code. Provided as part of your Sun WorkShop Visual package are HTML files containing instructions for running supplied Sun WorkShop Visual Replay scripts which run the tutorials for you. You simply watch each script running and then examine the results. Open the following files in an HTML browser:

  1. $VISUROOT/lib/locale/<YourLocale>/sc/timex.html. This describes the "Server Push" tutorial, demonstrating how to create an application with automatic remote update.
  2. $VISUROOT/lib/locale/<YourLocale>/sc/parsex.html. This describes how to create an application which fetches a Web page and then parses it.

Note - VISUROOT is the install directory of your Sun WorkShop Visual and "YourLocale" is the name of the locale you are using.


Extracting Information from HTML Data

An application developed with Internet Smart Code might be used to fetch a Web page, parse it and display the result. To help you organize any HTML data returned by a server and to considerably simplify the process, a full, yet simple to use, HTML parser is supplied with Sun WorkShop Visual.

As a result of the origins and intentions of the World Wide Web, most of the data fetched from Web servers will be in HTML. The parser is based on the reference SGML parser materials from the SGML User Group [1]. It has been adapted to produce a general purpose SGML parser engine. SGML works in conjunction with a DTD (Document Type Definition) to define a markup language. The DTD for HTML is supplied with Sun WorkShop Visual.

By adding extra DTDs, you can use the parser with other standard and in-house markup languages. You will also be able to upgrade your application for future versions of HTML and for XML.

The SGML parser has a simple and convenient programming interface. You register your interest in one or more features of the input stream (i.e. the HTML tags) and a routine of your choosing is called whenever the parser finds one of these features. This is analogous to the widget callback mechanism - widgets register their interest in certain actions and a given routine is called when such actions occur.
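
The register-then-dispatch idea can be illustrated with a toy example. The code below is not the SGML parser and does not use its API: toy_add_interest, toy_scan and count_tag are invented names, and the "scanner" merely looks for opening tags by searching for the "<" character. It shows only the pattern of registering a routine first and having it called back later.

```c
#include <string.h>

#define MAX_INTERESTS 8

/* Signature of a toy callback: the matched tag plus caller-supplied data. */
typedef void (*tag_fn)(const char *tag, void *client_data);

struct interest {
    const char *tag;
    tag_fn      fn;
    void       *client_data;
};

static struct interest interests[MAX_INTERESTS];
static int n_interests = 0;

/* Register an interest in a tag name. Returns 0, or -1 if the table is full. */
int toy_add_interest(const char *tag, tag_fn fn, void *client_data)
{
    if (n_interests >= MAX_INTERESTS)
        return -1;
    interests[n_interests].tag = tag;
    interests[n_interests].fn = fn;
    interests[n_interests].client_data = client_data;
    n_interests++;
    return 0;
}

/* Scan text for opening tags and call every routine registered for them.
 * Matching is case-sensitive and closing tags are skipped. */
void toy_scan(const char *text)
{
    const char *p = text;

    while ((p = strchr(p, '<')) != NULL) {
        size_t len;
        int    k;

        p++;                            /* step past '<' */
        if (*p == '/')
            continue;                   /* closing tag: not of interest */
        len = strcspn(p, " \t\r\n>");   /* length of the tag name */
        for (k = 0; k < n_interests; k++) {
            if (strlen(interests[k].tag) == len &&
                strncmp(p, interests[k].tag, len) == 0)
                interests[k].fn(interests[k].tag, interests[k].client_data);
        }
    }
}

/* Example callback: count how often the tag was seen. */
void count_tag(const char *tag, void *client_data)
{
    (void)tag;
    ++*(int *)client_data;
}
```

Registering count_tag for "li" with a pointer to an integer and then scanning a "menu" block with two list items leaves that integer at 2. The real parser works on the same principle but understands the full grammar described by the DTD.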


Note - If you are not familiar with the Web technology which this uses (or you are confused by the list of acronyms), you may need to do some background reading. See "Books on Networking and World Wide Web" on page 892, for a list of suggested books.
To tell Sun WorkShop Visual that you wish to use the parser to extract key information from the incoming data, set the "SGML/HTML Parsing" toggle in the Customize dialog. In your Receive Handler, set up the parser according to your requirements and then send the data to the parser. Exactly how to do all of this (and what happens next) is detailed below.


Using the Parser

Once you have told Sun WorkShop Visual that you wish to process SGML/HTML by selecting the toggle in the Customize dialog, the following four steps are needed. Each of these takes place inside your Receive Handler:

  1. Register the MIME type by calling the routine scRegisterSGMLMimeType (or the shortcut for HTML scRegisterHTML).
  2. Register an error handler by calling the routine scRegisterSGMLErrorHandler. This is an optional step.
  3. Register an interest in one or more features of the input stream by calling the routines scAddTagCallback and scAddAttrCallback. Alternatively at this point you can request a traditional parse tree.
  4. Call the parser using the routine scProcessSGML.
Each of these steps is examined more closely in the following sub-sections.

Before programming the interface to the parser, make sure that you are including the following header file:

#include <SGML.h>
The directory of this header file, which is part of the Sun WorkShop Visual distribution, is automatically included in the Makefile.

In addition, you will need to set the DTDDIR environment variable to:

$VISUROOT/src/sgml/dtds
"Practical Information for Using the Parser" on page 556 provides some more information on the location of the SGML parser and the files it uses.


Registering the MIME Type

In order to configure the parser, you first need to create an SGML object. This object is then passed to any other routines you need to call. An SGML object is returned from the routine you call to register the MIME type of your data, which is shown below:

SGML_t *
scRegisterSGMLMimeType( mimetype, dtd)
        char * mimetype;
        char * dtd;
Use this to associate a MIME type with an SGML DTD. The most common will be:

SGML_t * sgm = scRegisterSGMLMimeType( "text/html", "HTML32.soc");
Because this is the most commonly used, the following is supplied as a shortcut:

SGML_t *
scRegisterHTML( mimetype)
        char * mimetype;
This does exactly the same as the one above, associating "text/html" with the HTML32 DTD. Add your own DTD by placing it in the directory referenced by the DTDDIR environment variable.
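
Before registering, a Receive Handler may want to confirm that the incoming data really is HTML. The sketch below is one hedged way of doing so. It assumes that the MIME type string may arrive in any letter case and may carry parameters (such as "; charset=..."), as HTTP Content-Type headers can; whether getMimeType returns such parameters is not stated here. The name is_html_mime is invented for the illustration.

```c
#include <ctype.h>
#include <stddef.h>

/* Return 1 if type names the text/html media type, 0 otherwise.
 * The comparison ignores case and any trailing parameters. */
int is_html_mime(const char *type)
{
    static const char html[] = "text/html";
    size_t i;

    if (type == NULL)
        return 0;
    for (i = 0; html[i] != '\0'; i++) {
        /* a shorter string fails here on its terminating NUL */
        if (tolower((unsigned char)type[i]) != html[i])
            return 0;
    }
    /* accept end of string, a parameter separator or white space */
    return type[i] == '\0' || type[i] == ';' ||
           isspace((unsigned char)type[i]);
}
```

A plain strcmp against "text/html" is enough when the type is known to be delivered in exactly that form.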

The following shows what is generated when "processMyData" has been specified as the Receive Handler, with "SGML/HTML Parsing" set. Lines have been added to declare the SGML object and to register the MIME type, and the MIME type lookup has been moved out of the "#if 0" example block so that it is compiled:

int
processMyData ( data, idata)
        sc_stdcs_t* data;
        sc_idata* idata;
{
	extern InputData * newInputData();

	group0_t * group = (group0_t*)data->group;
	InputStream * i   = (*idata->getInputStream)( idata);
	char      * type  = (*idata->getMimeType)(idata);
	SGML_t    * sgm;	/* the parser object */
#if 0 /* example usage */
	int         len   = (*idata->getContentLength)( idata);

	InputData * id    = newInputData( i);
	char *   d        = (*id->getData)( id);
#endif

	sgm = scRegisterHTML( type);
	...
	return 0;
}

Registering an Error Handler

The default error handler outputs error messages to standard output. You can override this by registering your own error handler using the following routine:

int
scRegisterSGMLErrorHandler( errorhandler)
        void_f errorhandler;
Your error handler should be of the form:

void
errorhandler( s)
        char * s;
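
A handler of this form might, for example, send messages to standard error and keep a count of them. The following is a sketch only: my_sgml_error, get_sgml_error_count and the message prefix are invented names, and the registration call is shown in a comment because it needs the Sun WorkShop Visual library. Depending on how void_f is declared in SGML.h, a cast may be needed in that call.

```c
#include <stdio.h>

/* Number of parser errors reported so far. */
static int sgml_error_count = 0;

/* Error handler of the form described in the text: one string argument. */
void my_sgml_error(char *s)
{
    sgml_error_count++;
    fprintf(stderr, "SGML parser: %s\n", s != NULL ? s : "(no message)");
}

/* Let the rest of the application find out whether parsing was clean. */
int get_sgml_error_count(void)
{
    return sgml_error_count;
}

/* In the Receive Handler, before calling the parser:
 *
 *     scRegisterSGMLErrorHandler( my_sgml_error);
 */
```

Keeping a count lets the Receive Handler decide after the parse whether the extracted information can be trusted.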

Registering Interest in Input Stream Features

To access the parsed data, you should register an interest in one or more features of the input (e.g. particular tags and attributes in HTML).

Registering interest in input features is directly analogous to the widget callback mechanism where widgets register their interest in certain actions and a specified routine (callback) is called when the action occurs. Here, you register your interest in features of the language and the parser calls your callback routines when it comes across one of these features.

There are two major features of HTML: tags and attributes. You can register an interest in these features using one of the two routines described in subsequent sections. First, though, a brief description of what is meant by tag and attribute.


Tags

Tags are features of HTML which describe the format of the following piece of text. Tags appear in angle brackets, for example <menu> to indicate a bulleted list or <code> to indicate a code listing. The following example shows a "menu" block containing individual list items:

<menu>
	<li>The first item in the list
	<li>The second item in the list
</menu>

Attributes

Attributes are another feature of HTML. They also appear inside angle brackets. Attributes are placed after the tag and are used to indicate a reference. This may be an external file, an image or a position elsewhere in the document. Attributes are always made up of a reserved string, an equals sign (=) and the reference. The following example shows two attributes. The first, an "href", names the destination of a link (somewhere called "bottom"). The text inside the block, "Go to bottom of page", is the "visible" part of this link. The second attribute is a "name". It names a location - in this case "bottom". So, from a user point of view, selecting "Go to bottom of page", moves the view to the named location "bottom":

<a href="#bottom"><b>Go to bottom of page</b></a>
...
<a name="bottom"></a>
There are two routines which you can use to register your interest in particular HTML language features. These are:

  1. scAddTagCallback. Use this to register an interest in a particular tag.
  2. scAddAttrCallback. Use this to register an interest in an attribute. This is the same as scAddTagCallback but configured to show an interest only in attributes.
The following sections explain these registration routines.


Registering Interest in Tags

The SGML parser needs to know which parts of the HTML input you are interested in. It also needs to know at what point within the selected block of HTML to call your callback routine.

The routine for registering interest in tag elements is:

int
scAddTagCallback( sgm, tagname, type, callback, data)
	SGML_t * sgm;
	char * tagname;
	int type;
	void (*callback)();
	void * data;
The parameters to this routine need more explanation. They are detailed in the following sub-sections.


SGML_t * sgm

This is the SGML object returned from scRegisterSGMLMimeType, the routine for registering the MIME type of your data. scRegisterSGMLMimeType is described in "Registering the MIME Type" on page 547.


char * tagname

This is the tag in which you are interested. Do not include the angle brackets, simply the tag itself in upper case e.g. "A" or "MENU" or "LI".


int type

This parameter tells the parser when within the chosen tag to call your callback routine. In addition, which "type" you choose determines whether your callback routine is passed any of the data from inside the selected block of HTML or not. Your callback routine always receives the tagname and the type (so that you can use the same routine for any number of different tags and types) but only "ON_ATTR" and "ON_DATA" cause any more information to be returned.

You have a choice of four pre-defined types, according to where in the tag block you wish your callback routine to be called as illustrated in Figure 18-7. The four types are:

ON_ENTRY. This refers to the beginning of a block (e.g. <a> or <menu>). No data (referred to as call_data in "Your Callback Routine" on page 552) is passed to your routine.
ON_EXIT. This refers to the end of a block (e.g. </a> or </menu>). No data (referred to as call_data in "Your Callback Routine" on page 552) is passed to your routine.
ON_ATTR. This refers to the attribute appearing inside a tag definition (e.g. href="mylink"). Your callback routine receives the text inside the quotation marks as its call_data parameter (as explained in "Your Callback Routine" on page 552).
ON_DATA. This refers to the text (or data) after the tag. Your callback routine receives the text between the beginning and the end of the tag as its call_data parameter (as explained in "Your Callback Routine" on page 552).

FIGURE  18-7 Types


void (*callback)()

This is the name of the routine which the parser should call when it comes across a tag you are interested in (the callback routine). This is a routine defined by you. The format of this routine is described in "Your Callback Routine" on page 552.


void * data

This parameter gives you a chance to pass data to your callback routine. This will be passed to your routine as its "client data" parameter.

You may register an interest in any number of tags, calling this routine once for each tag.


Registering Interest in Attributes

The routine for registering interest in attributes is:

int
scAddAttrCallback( sgm, tagname, attrname, callback, data)
	SGML_t * sgm;
	char * tagname;
	char * attrname;
	void (*callback)();
	void * data;
The only parameter which is different from those described for the tag registering routine above, is:

char * attrname. This is the name of the attribute in which you are interested.
You may register an interest in any number of attributes.
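
As an illustration, an attribute callback could collect every "href" value into a list supplied as client data. The sketch below makes some assumptions: LinkList, collect_link and the size limits are invented names, the value is copied because the text does not say how long the parser keeps the string it passes in, and the meaning the parser attaches to the callback's return value is not documented here, so 0 is returned on success in the same way as the generated stubs. The parameter list follows the form described in "Your Callback Routine".

```c
#include <string.h>

#define MAX_LINKS    16
#define MAX_LINK_LEN 128

/* Caller-owned storage, handed to the callback as client data. */
typedef struct {
    char urls[MAX_LINKS][MAX_LINK_LEN];
    int  count;
} LinkList;

/* Attribute callback: call_data is the attribute value, client_data is
 * the LinkList given at registration time. */
int collect_link(char *tag, char *attribute, int type,
                 void *call_data, void *client_data)
{
    LinkList   *list  = (LinkList *)client_data;
    const char *value = (const char *)call_data;

    (void)tag;
    (void)attribute;
    (void)type;
    if (list == NULL || value == NULL || list->count >= MAX_LINKS)
        return -1;
    /* copy the value, truncating over-long links */
    strncpy(list->urls[list->count], value, MAX_LINK_LEN - 1);
    list->urls[list->count][MAX_LINK_LEN - 1] = '\0';
    list->count++;
    return 0;
}
```

Registration would then pass the address of a LinkList as the final argument, along the lines of scAddAttrCallback( sgm, "A", "HREF", collect_link, &links).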


Your Callback Routine

There is no stub file for your callback routine. You must write it all yourself. The name of the routine is the name specified as the fourth parameter to scAddAttrCallback or scAddTagCallback (that is, the parameter called "callback").

Your callback routine is called by the parser when it detects a tag or attribute in which you have registered an interest. The following example shows what your callback should look like and lists the parameters passed in:

int
mycallback( tag, attribute, type, call_data, client_data)
        char * tag;
        char * attribute;
        int    type;
        void * call_data;
        void * client_data;
The parameters passed into your routine are:

  1. char * tag. This is the tag which the parser detected.
  2. char * attribute. This is the attribute which the parser detected. If you were only interested in tags, this is null.
  3. int type. This is whether the routine has been called "ON_ENTRY", "ON_EXIT", "ON_DATA" or "ON_ATTR". These four are discussed in "int type" on page 551. This parameter is not relevant when you are interested in attributes.
  4. void * call_data. When you are interested in attributes, this is the part of the attribute which comes after the equals sign. For example, if you have registered an interest in the "href" attribute and the parser finds the following line:
<a href="#regmime">
then this parameter would be "#regmime".

If you have specified "ON_DATA" as the type when registering an interest in a tag, this parameter is the data after the tag. See Figure 18-7 on page 551 for an illustration.

  5. void * client_data. This is the "data" parameter passed to the registration routine, allowing you to pass your own data into the callback. This is similar to the client data for Xt callbacks, as seen in the Callbacks dialog.
Because you may register an interest in any number of tags or attributes, you can also have any number of callback routines, but having one for tags and another for attributes is probably the most useful combination.
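
A single tag callback can serve several types by switching on its type parameter. The sketch below assumes that the routine has been registered once for each type of interest. TagStats and track_tag are invented names, and the ON_ values defined here are stand-ins so that the fragment compiles on its own; real code takes them from SGML.h, where the actual values may differ.

```c
#include <string.h>

/* Stand-in values for illustration only; remove these and include
 * <SGML.h> when building against the real parser. */
enum { ON_ENTRY = 1, ON_EXIT, ON_ATTR, ON_DATA };

/* State kept between calls, passed in as client data. */
typedef struct {
    int  depth;          /* blocks of the tag currently open */
    int  blocks_seen;    /* total number of blocks entered   */
    char last_text[64];  /* most recent ON_DATA text         */
} TagStats;

int track_tag(char *tag, char *attribute, int type,
              void *call_data, void *client_data)
{
    TagStats *st = (TagStats *)client_data;

    (void)tag;
    (void)attribute;
    if (st == NULL)
        return -1;
    switch (type) {
    case ON_ENTRY:                 /* no call_data for this type */
        st->depth++;
        st->blocks_seen++;
        break;
    case ON_EXIT:                  /* no call_data for this type */
        if (st->depth > 0)
            st->depth--;
        break;
    case ON_DATA:                  /* call_data is the text after the tag */
        if (call_data != NULL) {
            strncpy(st->last_text, (const char *)call_data,
                    sizeof st->last_text - 1);
            st->last_text[sizeof st->last_text - 1] = '\0';
        }
        break;
    default:
        break;
    }
    return 0;
}
```

Keeping the state in a structure passed as client data, rather than in global variables, allows the same routine to be registered for different tags with separate counters.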


Parsing the Input Stream

Once you have configured an SGML object by specifying the MIME type, registering error handlers and registering interest in particular input stream features, you are ready to call the parser. Here is the routine to do this:

int
scProcessSGML( sgm, istream)
	SGML_t * sgm;	/* the parser handle from scRegisterSGMLMimeType */
	InputStream * istream;	/* the input stream from the server */
The first parameter is described in the preceding sections. The second parameter, the input stream, is passed to your Receive Handler.


Using the Parser - Example

This section provides an illustration of the use of the SGML parser. When you specify in the Customize dialog that you wish to have SGML parsing, you also need to provide a name for a Receive Handler. Here is the stub for the handler which is generated by Sun WorkShop Visual:

int
processMyData ( data, idata)
        sc_stdcs_t* data;
        sc_idata* idata;
{
        extern InputData * newInputData();

        group0_t * group = (group0_t*)data->group;
        InputStream * i   = (*idata->getInputStream)( idata);
#if 0 /* example usage */
        char      * type  = (*idata->getMimeType)(idata);
        int         len   = (*idata->getContentLength)( idata);

        InputData * id    = newInputData( i);
        char *   d        = (*id->getData)( id);
#endif
        return 0;
}
In order to make use of the SGML parser, you will have to add some code to this routine in order to create an SGML object, configure it and then send the incoming data to the parser. Here is the Receive Handler with extra code for parsing incoming HTML:

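int
processMyData ( data, idata)
        sc_stdcs_t* data;
        sc_idata* idata;
{
	extern InputData * newInputData();
	extern int getanchor(), getlinkinfo();	/* your callback routines */

	group0_t * group = (group0_t*)data->group;
	InputStream * i   = (*idata->getInputStream)( idata);
	/* moved out of the "#if 0" block because strcmp() below needs it */
	char      * type  = (*idata->getMimeType)(idata);
	SGML_t    * sgm;	/* the parser object */

#if 0 /* example usage */
	int         len   = (*idata->getContentLength)( idata);

	InputData * id    = newInputData( i);
	char *   d        = (*id->getData)( id);
#endif

	/* only HTML is handled here */
	if ( strcmp( type, "text/html") != 0)
		return -1;
	sgm = scRegisterHTML( type);
	/* call getanchor at the start of every <a> block */
	(void) scAddTagCallback(sgm, "A", ON_ENTRY, getanchor, 
			"a-call");
	/* call getlinkinfo for every href attribute of an <a> tag */
	(void) scAddAttrCallback(sgm,  "A", "HREF", 
			getlinkinfo, "href");
	/* parse the input stream, calling the routines registered above */
	(void) scProcessSGML( sgm, i);

	return 0;
}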
This routine specifies that getanchor should be called whenever the parser finds an anchor tag (<a>) and getlinkinfo should be called whenever the "href" attribute is found. These routines, written by yourself, should look like this:

int
getanchor( tag, attr, type, call_data, client_data)
	char * tag;
	char * attr;
	int    type;
	void * call_data;
	void * client_data;
{
	/* client_data is the "a-call" string passed at registration */
	printf("anchor-start(%s)\n", (char *) client_data);
	return 0;
}

int
getlinkinfo( tag, attr, type, call_data, client_data)
	char * tag;
	char * attr;
	int    type;
	void * call_data;
	void * client_data;
{
	/* call_data is the attribute value, for example "#regmime" */
	printf( "%s=%s\n", (char *) client_data, (char *) call_data);
	return 0;
}

Note - See "Going a Step Further" on page 545 for information on how to run an on-line Sun WorkShop Visual Replay script which makes use of the parser.

Practical Information for Using the Parser

To use the SGML parser, you will need to link with precompiled code. The sources are available in:

$VISUROOT/src/sgml

The license provisions mean that you are free to use them as you wish.

To begin with, it is easier to use the precompiled version which comes with Sun WorkShop Visual.

The SGML parser uses the following files and directories:

  1. $VISUROOT/lib. This contains an archive and a shared version of the SGML library. The make rules, generated by Sun WorkShop Visual into the Makefile, use libsgml.so, but you can link with libsgml.a if you prefer.
  2. $VISUROOT/src/sgml/hdrs/SGML.h. This is the include file necessary to use the parser engine API. It is referenced in the Makefile if you set the "SGML/HTML Parsing" toggle in the Smart Code Customize dialog.
  3. $VISUROOT/src/sgml/dtds. This is the directory containing the HTML 3.2 DTD and other related data files. The parser will need to find this, so you need to set the DTDDIR environment variable to:
$VISUROOT/src/sgml/dtds
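
Because the parser depends on DTDDIR at run time, an application may want to check the variable itself before starting a parse. The following is a minimal sketch of such a check; dtddir_problem and check_dtddir are invented names, and only the standard getenv() call is used. It checks that the variable is set and non-empty, not that the directory actually contains the DTD files.

```c
#include <stdio.h>
#include <stdlib.h>

/* Return a description of what is wrong with a DTDDIR value,
 * or NULL if it looks usable. */
const char *dtddir_problem(const char *value)
{
    if (value == NULL)
        return "DTDDIR is not set";
    if (value[0] == '\0')
        return "DTDDIR is set but empty";
    return NULL;
}

/* Check the environment of the running process.
 * Returns 0 if DTDDIR looks usable, -1 otherwise. */
int check_dtddir(void)
{
    const char *problem = dtddir_problem(getenv("DTDDIR"));

    if (problem != NULL) {
        fprintf(stderr, "%s; expected $VISUROOT/src/sgml/dtds\n", problem);
        return -1;
    }
    return 0;
}
```

Calling check_dtddir() at the top of the Receive Handler turns a missing setting into an explicit message.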
Before compiling, you should make sure that:

  1. $VISUROOT/bin is in your PATH
  2. $VISUROOT/lib has been added to your library path environment variable (for example, LD_LIBRARY_PATH for 32 bit applications and LD_LIBRARY_PATH64 for 64 bit applications).



[1] Standard Generalized Markup Language Users' Group (SGMLUG) SGML Parser Materials. Written by James Clark.

