Tuesday, March 27, 2007

Review Topic: IMS basics


I’ve had experience with IMS twice in my career. Once was from 1979-1981 with the Blue Cross / Blue Shield Combined Medicare Project in Dallas, where I designed a back end statistical reporting system that would have to harvest medical information from a claims database in IMS. I would code the requirements on “Pride-Logik” forms, and a DBA would design the IMS retrieval methods and control blocks.

Later, in 1998, I implemented a National Change of Address system that had to communicate with a mainframe customer-management name-and-address database, then in IMS. Actually, the database was accessed through a meta-language provided by Computer Sciences Corporation, so we did not have to be very concerned with the IMS control blocks themselves, although I remember one controversy over how generic to make the access, and certain difficulties in getting the PSBs to work in the test regions. In 1999 we converted to DB2.

Back in the early to mid 1980s, IMS and CICS were very much the preferred mix that headhunters looked for in mainframe programmers (at least in Dallas). That would change gradually throughout the 90s.

Relatively few shops, compared to the universe of mainframe installations, use IMS today. Therefore companies will sometimes scour the country for an IMS guru when they need one, especially someone with IMS-DC, which in the early 1980s had been a credible competitor to CICS as a teleprocessing monitor.

Nevertheless, sometimes candidates who have IMS on their resumes might get a few general questions in telephone interviews, so here are a few concepts.

A DL/I program in batch is executed by JCL whose EXEC statement actually says something like PGM=DFSRRC00,PARM='DLI,BMPR165M', where BMPR165M is the actual COBOL program. The LINKAGE SECTION of the COBOL program contains the PCB (Program Communication Block) masks. The PROCEDURE DIVISION has an ENTRY 'DLITCBL' statement and then later makes calls to 'CBLTDLI' with arguments that include the function code, the PCB mask, the retrieval (I/O) area, and an SSA (segment search argument).
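As a hedged sketch (the program, segment, and field names here are hypothetical, and only the first few PCB mask fields are shown), the skeleton might look something like this:

       WORKING-STORAGE SECTION.
       01  GU-FUNC          PIC X(4)  VALUE 'GU  '.
       01  CLAIM-IO-AREA    PIC X(200).
      * UNQUALIFIED SSA: 8-BYTE SEGMENT NAME PLUS A BLANK
       01  CLAIM-SSA        PIC X(9)  VALUE 'CLAIMSEG '.
       LINKAGE SECTION.
       01  CLAIM-PCB.
           05  PCB-DBD-NAME     PIC X(8).
           05  PCB-SEG-LEVEL    PIC X(2).
           05  PCB-STATUS       PIC X(2).
       PROCEDURE DIVISION.
           ENTRY 'DLITCBL' USING CLAIM-PCB.
           CALL 'CBLTDLI' USING GU-FUNC CLAIM-PCB
                                CLAIM-IO-AREA CLAIM-SSA.
           IF PCB-STATUS = SPACES
               DISPLAY 'SEGMENT RETRIEVED'
           ELSE
               IF PCB-STATUS = 'GE'
                   DISPLAY 'SEGMENT NOT FOUND'.
           GOBACK.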

A DBDGEN is set up by the DBA to describe the physical database. The PSBGEN sets up Program Specification Blocks that will live in a PSB library. Each PSB has PCBs, or Program Communication Blocks, that typically name the hierarchical database segments to which the program is "sensitive"; sensitivity can even be defined at the individual field level. Some typical DL/I calls include GU (Get Unique), GHU (Get Hold Unique), GNP (Get Next Within Parent) and many others (DLET, ISRT, REPL). SSAs can have command codes that accomplish things like path calls and establishing parentage; a sketch of one follows.
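For illustration (the segment and field names are again hypothetical), a qualified SSA carrying the 'D' command code, which asks DL/I to return the whole path of segments on the call, might be laid out like this:

       01  CLAIM-SSA-Q.
      * SEGMENT NAME, '*' PLUS COMMAND CODE, THEN THE QUALIFICATION
           05  FILLER         PIC X(8)  VALUE 'CLAIMSEG'.
           05  FILLER         PIC X     VALUE '*'.
           05  FILLER         PIC X     VALUE 'D'.
           05  FILLER         PIC X     VALUE '('.
           05  FILLER         PIC X(8)  VALUE 'CLAIMNO '.
           05  FILLER         PIC X(2)  VALUE ' ='.
           05  SSA-CLAIM-KEY  PIC X(10).
           05  FILLER         PIC X     VALUE ')'.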

A good textbook is still “IMS Programming Techniques” by Dan Kapp and Joseph F. Leben, Van Nostrand Reinhold, 1978.

I have an earlier posting about my experience with IMS here (Nov 2006).

Monday, March 26, 2007

Mainframe productivity tools and source management


Most mainframe jobs require job-ready skills in the normal utilities that go with a mainframe environment. Usually these include TSO/ISPF and Roscoe, which tends to serve as an even simpler source-manipulation tool, using fewer resources for most tasks. Here is a reference on ISPF, and that would include a review of the PF and PA keys.

By the mid 1980s, 3270-type terminals had extensive programmability features, including play-keys that could fill in log-on information and set up screens. These seem passé today.

REXX (the Restructured Extended Executor), an interpretive language, is used in some TSO shops, but I have never needed it. Here is the wiki write-up. TPX is often used as a session controller and is sometimes used to manage complicated, interrelated test and production versions of entities on different platforms (which can present maintenance issues).

All shops use source control management and security products on the mainframe. The major source control packages are Serena's Changeman, CA-Endevor, and CA-Librarian. These products guarantee that source code and load modules remain "in sync," an important security issue (especially for internal security). Since the late 1980s, programmers typically have not had access to update production files, with access controlled by products like Top Secret and RACF. Some programmers find this annoying (and believe that programmers should be bonded); others find it reassuring. Some database products, such as IDMS central version or Dun & Bradstreet's Information Expert, can be difficult to secure completely with these products.

CC/Harvest provides similar source control in a client-server environment, and I had some training in it in 2001.

Computer Associates, in one recent press release, makes the amazing statement that about 75% of the world's business systems are still in COBOL, at this link. CA also has a link discussing constructive legacy migration to client-server here. In the 1990s, companies sometimes kept original legacy systems after mergers and replicated the data on a mid-tier for a common user interface, but direct-connect technologies (available from most relational database vendors, as with DB2, though not necessarily part of ANSI SQL) may be making this less appealing now.

A job that I had with a small consulting company in 1989 actually used VM on a 4341, with an operating system and F-disk that resembled the operation of a DOS PC at the time. You could access SAS directly from VM, or jump to an MVS environment on a 4381.

Monday, March 19, 2007

A couple of algebra class puzzles


I've always wondered how you get from the usual definition of a hyperbola, ((x-h)**2 / a**2) - ((y-k)**2 / b**2) = 1, and its variations (opening east-west or north-south -- see Wikipedia here), to the "gas law"-like graph xy = k, which has asymptotes along the x and y axes (or lines parallel to them). There aren't a lot of sites that explain the derivation, but here is one.
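The connection between the two forms is a 45-degree rotation of axes; here is a minimal sketch, for k > 0. Substituting the rotation

\[
x = \frac{x' - y'}{\sqrt{2}}, \qquad y = \frac{x' + y'}{\sqrt{2}}
\]

into xy = k gives

\[
xy = \frac{(x' - y')(x' + y')}{2} = \frac{x'^2 - y'^2}{2} = k
\quad\Longrightarrow\quad
\frac{x'^2}{2k} - \frac{y'^2}{2k} = 1,
\]

a standard east-west hyperbola with a = b = \(\sqrt{2k}\). Rotating back carries its asymptotes y' = ±x' onto the coordinate axes.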

Another teaser question in algebra has to do with inverse functions. (Wiki reference.) f(x) and g(x) are inverses if f[g(x)] = g[f(x)] = x (x, not 1). So what is the inverse of f(x) = x? It is still g(x) = x. It's not 1/x; that is the multiplicative reciprocal (x times 1/x gives 1), which is a different notion of "inverse." If you compose with 1/x instead, you just get 1/x back, not x.
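As a quick check against the composition definition above:

\[
f(x) = x,\; g(x) = \frac{1}{x}
\;\Longrightarrow\;
f[g(x)] = \frac{1}{x} \neq x,
\qquad\text{while}\qquad
g[g(x)] = \frac{1}{1/x} = x,
\]

so 1/x is not the inverse of x; in fact, 1/x is its own inverse under composition.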

Sunday, March 18, 2007

CHKDSK on XP system with NTFS


Although CHKDSK would run at the next boot on my Sony Vaio if the power ever got cut off, it never did that on my Dell XP Home machine that I bought in 2003. About a year ago I bought a Belkin uninterruptible power supply to prevent power-outs, and that has worked. I had logged off the system normally Saturday morning. I've been getting all of the automatic Microsoft updates, and Saturday afternoon, when I booted up after returning from the Iraq demonstrations, I got a message "NTFS Volume C is dirty" (or words to that effect) on the blue screen, followed by a 40-minute CHKDSK that consisted of three steps: (1) file verification, (2) index verification, and (3) security descriptor verification. The process sat in phase (3) for almost fifteen minutes with no evidence of progress, but then suddenly finished. Only one bad file and one bad index were found (both were cleaned by CHKDSK), and this was an obscure .gif that could have been left by an infected website, although McAfee had not found anything.

A dirty volume ordinarily means improper shutdown, but it might also mean hardware corruption or gradual failure, or viral or software corruption, which should be rare if a machine is properly protected.

Here is Microsoft's reference on FAT and NTFS:

Here are a couple of other discussions: dirty volumes, chkdsk user guide.

Internet sources suggest this is a common occurrence, and recommend scheduling CHKDSK once a month, to run at the next reboot (a sketch of the relevant commands follows). On larger machines, CHKDSK /f can run a long time, so it could be a problem if one has to go to work.
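As a hedged sketch from an XP command prompt (both commands exist on XP, though exact behavior can vary with configuration): fsutil reports whether a volume is flagged dirty, and running CHKDSK with /f against the in-use system volume offers to schedule the check for the next reboot.

    C:\> fsutil dirty query c:
    C:\> chkdsk c: /f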

Friday, March 16, 2007

Mainframes: Get back to the basic stuff and know how to read abend dumps


I recall being asked, in a job interview with Legent back in 1989, whether I could solve mainframe program abends without Abend-AID, the usual tool in production shops to help programmers solve production abends in batch. An important item in Abend-AID is the next sequential instruction to be executed (the NSI), but it is not always available for some of the stranger abends, or even for some SOC4's.

It's good to be familiar with some of the parts of a COBOL dump. The TGT is the Task Global Table, which comprises a SAVE AREA, BL cells (matching the program's DMAP), TEMP STORAGE, and BLL cells, which apply to items in the LINKAGE SECTION of the DATA DIVISION. There are also INDEX cells. These items might point to definitive information about a load module even if the matching compile listing has somehow been lost through careless move procedures.

R13 has the address of the TGT. R14 has the address of the RETURN to your program from a CALLED program, and R15 has the address of the entry point of a called program.

A good reference for this sort of stuff, going all the way back to 1989, is Edward A. Kelly, "An Invitation to MVS Using COBOL" (TAB Books, 222 pages).

Wednesday, March 14, 2007

Some summary notes on IDMS/R


IDMS/R is a network-like database that was popular in the 1980s and 1990s on IBM-style mainframes. It originally belonged to Cullinane (later Cullinet) before passing to Computer Associates. The separate Datacom DB and DC products (more of an inverted-list system) belonged to ADR (Applied Data Research) in Dallas, actually at the confluence of the North Central Expressway and LBJ I-635; I was there a couple of times. A precursor to IDMS on the Univac 1108/1110 was DMS-1100, which had a similar network architecture as early as the early 1970s.

A database record in IDMS is an entity and all of its associated entities. An example could be an instance of a single baseball team (the Washington Nationals, a.k.a. the Montreal Expos), all of its players, its home and away schedules, its statistics, salaries, stadium, etc.

Network relationships are expressed with a concept called a SET, which has owners and members and can be navigated sequentially. Many-to-many relations (as in mathematics) are not allowed directly. Indexed sets are important for retrieval by generic key, random, or sequential processing.

Sets can contain mandatory or optional members, which can be disconnected or connected. A practical example could include whether a player remains active on a 25-player baseball team roster after spring training.

Databases are described in a Schema, coded in DDL (data definition language), which describes record layouts and usually relates them to sets. Other concepts are the Area and the LOCATION MODE, which may be DIRECT, CALC (with a key), or VIA a set. A Schema emphasizes the physical layout of the database, whereas a subschema stresses the logical views that may be needed by one application (say, reporting baseball statistics) or group of programs.

Later versions of IDMS (since the mid 1990s) allow SQL. But navigation of a set, one record at a time, is common in batch or simpler online programs and is conceptually similar to a DB2 cursor. Here is a reference to SQL in IDMS.

IDMS DML verbs include OBTAIN, MODIFY, and STORE; a sketch of set navigation follows.
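As a hedged sketch (the record, set, and key names are hypothetical; the record BINDs normally generated from COPY IDMS statements are omitted; and this is the COBOL-embedded DML form that the IDMS precompiler expands), walking a set one member at a time might look like:

           BIND RUN-UNIT.
           READY USAGE-MODE IS RETRIEVAL.
      * GET THE OWNER BY ITS CALC KEY, THEN WALK THE SET
           MOVE 'NATIONALS' TO TEAM-NAME.
           OBTAIN CALC TEAM-REC.
           IF ERROR-STATUS = '0326'
               DISPLAY 'TEAM NOT FOUND'
           ELSE
               OBTAIN FIRST PLAYER-REC WITHIN TEAM-PLAYER
               PERFORM UNTIL ERROR-STATUS = '0307'
                   DISPLAY PLAYER-NAME
                   OBTAIN NEXT PLAYER-REC WITHIN TEAM-PLAYER
               END-PERFORM.
           FINISH.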

IDMS can be run with VSAM files (which can be defined and manipulated with IDCAMS) or with IDMS physical files.

There are a variety of usage modes: retrieval, protected retrieval, exclusive retrieval, update, protected update, and exclusive update. Locks can be implicit or explicit, and may be shared or exclusive. You can code a KEEP on the OBTAIN statement, or use a KEEP statement, to maintain a set relationship or to maintain a locking level.

IDMS batch programs may be run through a central version or in local mode. It used to be difficult to get some security packages to work in CV mode.

Common "error" codes include 307 (end of set, as when a set is processed like a DB2 cursor or flat file), and 326 (record not found). Another important code is 0966, when you try to update an area controlled by the Central Version while running in local mode (would not happen if files are de-allocated).

The online environment is called ADS/Online, or ADSO. This provides a simplified fourth-generation language for defining screens and user transactions, and is considerably faster to develop in than conventional CICS command-level programming. Some important concepts are premap processing, mapout and mapin, and response processing.

A run-unit is very much like a CICS task.

Here is a good short-answer quiz on IDMS, at this link, on Geekinterview, which appears to have some tests in other areas, too.

Thursday, March 08, 2007

Review topic: DB2, binds


A mainframe program accessing DB2 will have associated with it a DBRM, or database request module, which describes the different accesses to the DB2 tables needed in the program. The bind procedure for the program validates your security authorization to use the DB2 commands and functions and defines the physical access methods. The bind attaches the program to a package, which can belong to a collection that in turn belongs to a plan; or a bind can attach directly to a plan.

A plan can have several isolation levels: RR (Repeatable Read), RS (Read Stability), CS (Cursor Stability -- often the best choice), and UR (Uncommitted Read, OK for tables not updated often). The bind parameters can include ACQUIRE(ALLOCATE), ACQUIRE(USE), RELEASE(DEALLOCATE), and RELEASE(COMMIT); the allocate/deallocate pair is usually recommended for batch, and the use/commit pair for online.
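As a hedged illustration (the plan, collection, and subsystem names here are hypothetical), a bind issued under the TSO DSN command processor might look something like:

    DSN SYSTEM(DB2P)
    BIND PLAN(CLAIMPLN) -
         PKLIST(CLAIMCOL.*) -
         ISOLATION(CS) -
         ACQUIRE(USE) -
         RELEASE(COMMIT) -
         VALIDATE(BIND)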

Table lock modes are IS, IX, S, U, SIX, and X (S means share, I means intent, X means exclusive); page and row locks may be S, U, or X. The bind isolation parameters that affect these locks are the same RR (Repeatable Read), RS (Read Stability), CS (Cursor Stability), and UR (Uncommitted Read) noted above.

DB2 on the mainframe recognizes these join methods: merge scan, nested loop, and hybrid; DB2 on distributed platforms also offers a hash join. A heap table is a table whose rows are stored in no particular order, without a clustering index.

A DB2 trigger can execute SQL (or invoke stored procedures) when certain conditions are encountered, such as an overdraft on a checking account.
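A hedged sketch (the table and column names are hypothetical; the syntax shown is along the lines of DB2 for z/OS, where the MODE DB2SQL clause is required):

    CREATE TRIGGER OVERDRFT
      AFTER UPDATE OF BALANCE ON ACCOUNTS
      REFERENCING NEW AS N
      FOR EACH ROW MODE DB2SQL
      WHEN (N.BALANCE < 0)
      BEGIN ATOMIC
        INSERT INTO OVERDRAFT_LOG (ACCT_NO, BALANCE)
          VALUES (N.ACCT_NO, N.BALANCE);
      END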

DB2 will not let a single UPDATE statement change a primary key value in more than one row at a time.

A host variable is the COBOL field into which data is placed after a SELECT ... INTO (or FETCH); inside embedded SQL, host variables are prefixed with a colon.
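A minimal embedded-SQL sketch (the table and column names are hypothetical):

           EXEC SQL INCLUDE SQLCA END-EXEC.
       01  WS-CUST-ID        PIC S9(9) COMP.
       01  WS-CUST-NAME      PIC X(30).
      * ...
           EXEC SQL
               SELECT CUST_NAME
                 INTO :WS-CUST-NAME
                 FROM CUSTOMER
                WHERE CUST_ID = :WS-CUST-ID
           END-EXEC.
           IF SQLCODE = 0
               DISPLAY 'FOUND ' WS-CUST-NAME
           ELSE
               IF SQLCODE = +100
                   DISPLAY 'NOT FOUND'.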

Note the CASCADE and RESTRICT options in the ON DELETE clause of a foreign key definition; an example follows.
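For illustration (hypothetical tables): with CASCADE, deleting a parent row deletes its dependents automatically; with RESTRICT, the delete is rejected while dependents exist.

    CREATE TABLE ORDER_ITEM
      (ORDER_NO   INTEGER NOT NULL,
       ITEM_NO    INTEGER NOT NULL,
       PRIMARY KEY (ORDER_NO, ITEM_NO),
       FOREIGN KEY (ORDER_NO) REFERENCES ORDERS
         ON DELETE CASCADE);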

Here is a geekinterviews short answer quiz on mainframe DB2, at this link.

Monday, March 05, 2007

Is the mainframe back? Are benched programmers to be welcomed back?


Bob Weinstein's column "Tech Watch" in the "Recruitment Times" of the March 5, 2007 The Washington Times is titled "The Mainframe Is Back." Weinstein maintains that the IBM mainframe (complete with MVS through OS/390, JCL, TSO/ISPF, batch cycles, CICS, IMS, DB2, RACF, and the comfy old conservative computing culture in large companies surrounding it) is back. After the 2001 recession, the 9/11 attacks, and the scandals, it took a real dip, but it never went away. Now, a lot of programmers with these skills are older (often in their 50s and even 60s, like me), and many "dropped out" during the bloodletting, when companies merged and consolidated data centers or sent a lot of maintenance work, including nighttime production batch cycle support (fixing SOC7's, etc.), overseas. Some work is coming back, and demand has been increasing in recent months. To keep this in perspective, remember that Bob Weinstein had written a similar column in April 2002, predicting the return of the mainframe -- I remember the column in the Minneapolis papers, and outplacement companies then chuckled at the "ray of hope."

I have noticed this. Two years ago, most of the gigs required very specific skill sets, and favored professionals who had stuck with a narrow skill set to become "experts" rather than job-hopping for new exposure, which had happened so much in the 90s. Particularly desired was in-depth Medicaid MMIS experience -- but you needed five years minimum to be considered -- and also DB2 or IMS internals, and CASE tools, which seem to be used a lot with welfare and social services systems run by states. There was a Catch-22: because IMS is generally a low-demand area, few people will stay in it, so a company in bad need of support scours the country, or the world, for narrow expertise.

In the 1980s the IBM mainframe had a lock on major companies, with the prevailing shop culture being CICS as the teleprocessing monitor (with its whole world of Meta COBOL programming conventions, especially revolving around pseudo-conversational processing) and IMS as the database. (There was also an IMS-DC, sometimes seen. ADR's Datacom also had a DB and a DC, which were simpler, and which I learned at Chilton Credit Reporting in Dallas in the 1980s; I would not get back to CICS until 1990, starting at the ALC macro level. DB2 and SQL started becoming significant in the late 1980s. The culture of large mainframe computing in financial institutions or insurance companies tends to emphasize huge numbers of transactions with history, and favors cursor, record, or row processing instead of the set processing preferred on the Web.)

Since mid-2006 the gigs have tended to become broader and less demanding of very narrowly defined skill sets, which suggests that companies would like some of us to come back. And just recently, companies seem more likely to ask for in-person interviews rather than phone screenings, which indicates that they believe the need for employees to live in corporate apartments in new cities (under a W-2 rate or corp-to-corp) may be less critical than a couple of years ago. (By the way, there are tax rules here -- check with an advisor -- although many contracts have to be written as less than 365 days.) Brainbench test certifications can help re-establish a professional's credibility. (Look in the "Test Center" in the "Skills Center" on the left-hand side of the log-on page.) My certifications (now COBOL II, JCL, SQL) can be reached by linking from here (the transcript link is at the bottom).

Recruiters say that you don't forget COBOL and skills like TSO/ISPF. IBM mainframe JCL had a reputation for being difficult, but it still quickly became second nature for me. IBM is reported to be working on making it simpler, with abbreviated forms more like Unix (or perhaps more like the greatest mainframe JCL of all -- the obsolete Univac 1108 Exec 8 -- remember that? It was very logical! -- all the way back to the days of Nixon and Ford, and those days of benchmarks at the Univac facility on Pilot Knob Road -- or was it Yankee Doodle? -- in Eagan, MN, near St. Paul. Those were the days, my friend.)

We shall see.

Thursday, March 01, 2007

More on putting up a political theory database


On October 18, I sketched out a possible database to map out and compare political arguments and document the sources that these arguments come from. The link is here.

The idea is that there would be three main tables, reasonably normalized: (1) a table of argument statements, coded by topic; (2) a table of historical incidents; and (3) a table of sources documenting the incidents and the arguments, coded as to the journalistic credibility of the source. The tables could be linked with appropriate SQL queries, mostly inner joins, as sketched below.
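As a hedged sketch (the table and column names are hypothetical placeholders for the three tables above, in Access-style SQL):

    SELECT a.ArgumentText, i.IncidentDesc, s.SourceName, s.CredibilityCode
    FROM (Arguments AS a
    INNER JOIN Incidents AS i ON a.ArgumentID = i.ArgumentID)
    INNER JOIN Sources AS s ON i.IncidentID = s.IncidentID
    WHERE a.TopicCode = 'IMMIG';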

I have been developing this database offline as a Microsoft Access application. In theory, it could be loaded to a Windows Server website, and a user with Access could map the appropriate drive and run it. (Once you create an application in Access, opening the .mdb from Windows Explorer actually executes the application.) Or it could be deployed as ASP pages on a properly configured Windows server, in appropriate directories with the proper permissions; you use the proper Access panels to connect to your server, set up appropriately to run the ASPs that access the data. Users would still need Access to use this kind of application.

A much more flexible approach is to develop an executable application to get and query the data from SQL Server. The best environment is probably Visual Studio .NET, with C# the language of choice. Microsoft offers Express versions of this, with a choice of the Web-development package or the ADO database-development package, but not both at the same time. To get both, you need at a minimum to purchase the Professional version and develop an application environment that can be loaded or copied to your web server.
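A minimal C# sketch of the query side (the connection string, table, and column names are hypothetical assumptions, not the actual application):

    using System;
    using System.Data.SqlClient;

    class ArgumentQuery
    {
        static void Main()
        {
            // Hypothetical connection string and schema; adjust for the real server.
            string connStr = "Server=myserver;Database=PoliticalDB;Integrated Security=true";
            using (SqlConnection conn = new SqlConnection(connStr))
            {
                conn.Open();
                string sql = "SELECT ArgumentText FROM Arguments WHERE TopicCode = @topic";
                using (SqlCommand cmd = new SqlCommand(sql, conn))
                {
                    // Parameterized query avoids quoting problems and injection.
                    cmd.Parameters.AddWithValue("@topic", "IMMIG");
                    using (SqlDataReader rdr = cmd.ExecuteReader())
                    {
                        while (rdr.Read())
                            Console.WriteLine(rdr.GetString(0));
                    }
                }
            }
        }
    }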

But accomplishing all of this would mean taking all of my written materials and formatting them in a professional manner, so that they could form the foundation of a commercial educational service for exploring controversial issues. I'll keep everyone posted.