Saturday, September 24, 2005

 

Swatch Labeler

This is different. I'm working with a client on the other side of the country. I need to communicate colors unambiguously to them. I plan to send a PDF that has a number of elements that look like this:



This is a grouped object in my InDesign document consisting of a rectangle with another rectangle rotated by 30 degrees and pasted into the first. It's the stroke of the pasted rectangle that separates the top left of the first rectangle from the second. Then, there's a two paragraph text frame set to vertically justify which labels the two colors in the two rectangles.

I grew tired of manually updating the names of the colors in the text frame, so I wrote this script to do the job for me. All I have to do is apply swatches to the two rectangles and then run this script:
//DESCRIPTION: Color Swatch Labeller

if ((app.documents.length != 0) && (app.selection.length != 0)) {
 var myGroup = app.selection[0];
 while (myGroup.constructor.name != "Group") {
  myGroup = myGroup.parent;
  if (myGroup.constructor.name == "Application") { errorExit("Selection is not part of a group.") }
 }
 var myRect = myGroup.rectangles[0];
 var myPasted = myRect.rectangles[0];
 var myTF = myGroup.textFrames[0];
 var myRE = new RegExp("\\D+(\\d+)\\D+(\\d+)\\D+(\\d+)\\D+(\\d+)");
 var myString = myRect.fillColor.name;
 myString = myString.replace(myRE, "$1, $2, $3, $4");
 myTF.texts[0].paragraphs[0].characters.itemByRange(0,-2).contents = myString;
 var myOtherString = myPasted.fillColor.name;
 myTF.texts[0].paragraphs[1].contents = myOtherString;
} else {
 errorExit();
}

// +++++++ Functions Start Here +++++++++++++++++++++++

function errorExit(message) {
 if (arguments.length > 0) {
  if (app.version != 3) { beep() } // CS2 includes beep() function.
  alert(message);
 }
 exit(); // CS exits with a beep; CS2 exits silently.
}
Let's look closely at the main part of this script. The first while loop identifies the group of interest from the selection. Notice the error message that can be generated within the loop.

There is a potential problem when looking up the parental chain to try to find something: you might reach the top of the chain. But, there is no top as far as InDesign is concerned (I think this is a bug, myself; compare the parent of the top of the hierarchy in the File System object model: it is null, thereby eliminating infinite loops when searching up the parental chain). InDesign's application object is its own parent, so without the test:
  if (myGroup.constructor.name == "Application") { errorExit("Selection is not part of a group.") }
inside the loop, if the selection were not part of a group, the loop would spin endlessly.
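
To see why the guard matters, here is a minimal sketch in plain JavaScript with no InDesign objects. MockItem and findAncestor are made-up names for illustration; the only thing borrowed from InDesign is the idea that the top object is its own parent.

```javascript
// Mock object tree: an item created without a parent becomes its own
// parent, mimicking what is described above for the application object.
function MockItem(kind, parent) {
  this.kind = kind;
  this.parent = parent || this;
}

// Walk up the parent chain until an item of the wanted kind is found.
// Returns null if the self-parented top is reached first.
function findAncestor(start, wantedKind) {
  var item = start;
  while (item.kind != wantedKind) {
    if (item.parent === item) { return null; } // top of chain: stop
    item = item.parent;
  }
  return item;
}

var root = new MockItem("Application");
var page = new MockItem("Page", root);
var group = new MockItem("Group", page);
var rect = new MockItem("Rectangle", group);
var loose = new MockItem("Rectangle", page);

var found = findAncestor(rect, "Group");    // the group item
var missing = findAncestor(loose, "Group"); // null, and no endless loop
```

Testing for "is its own parent" rather than for a particular constructor name is a design choice in this sketch; the script above tests for the name "Application" instead.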

OK, in testing that I realized that my script is slightly inadequate in that if you have the text tool active with some text inside the text frame, the script fails to find the group. I need to remedy this by adding the isPureText() method and using it, like this:
 var myGroup = app.selection[0];
 if (myGroup.isPureText()) { myGroup = myGroup.parentStory.textFrames[0].parent }
 while (myGroup.constructor.name != "Group") {
While the script would still have worked had I left off the final .parent, it saves a step. That parent is either a group or not. If not, we're going to get the error anyway. If it is a group, it's the group we're looking for, so we can go right to it.

I haven't bothered any more error checking. This script has a very narrow purpose and so I can just assume that the group I've found is of the right kind. Hence:
 var myRect = myGroup.rectangles[0];
 var myPasted = myRect.rectangles[0];
 var myTF = myGroup.textFrames[0];
I know that these must be right (unless I've lost my mind and I'm running this script on some other group).

Hello! Another use of RegExp. It took me a while to get that string right. I forgot to include the double backslashes. They're needed because the expression is inside a string, so for one of them to get through to my RegExp expression, I need to have two: the first "escapes" the second.

The purpose of the RegExp is to convert a swatch name in the form C=0 M=75 Y=30 K=60 to the form I need it for my caption in the first paragraph of the text frame. "\D+" tells RegExp to seek out any string of one or more non-digits while "\d+" seeks any string of one or more digits. The parentheses provide the mechanism for extracting substrings from the total found string. All I want are the numbers, so I've put the digit strings into parentheses. Then, in the replace() call a couple of lines later, I use $1, $2, etc. to refer to these substrings, and I insert the comma-space separators to get the string into the form I want it. Then it's easy to update the contents of the first paragraph (taking care to avoid the closing return) with the string.
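
The conversion can be tried outside InDesign. This is a minimal sketch in plain JavaScript; cmykCaption is a made-up helper name, and the sketch uses \d+ for all four groups.

```javascript
// The pattern is written as a string, so every backslash is doubled.
var swatchRE = new RegExp("\\D+(\\d+)\\D+(\\d+)\\D+(\\d+)\\D+(\\d+)");

// Turn "C=0 M=75 Y=30 K=60" into "0, 75, 30, 60".
function cmykCaption(swatchName) {
  return swatchName.replace(swatchRE, "$1, $2, $3, $4");
}

var caption = cmykCaption("C=0 M=75 Y=30 K=60"); // "0, 75, 30, 60"
var untouched = cmykCaption("PANTONE 1807 C");   // no match, so unchanged
```

A name that doesn't contain four separate runs of digits simply fails to match, and replace() hands it back as it was.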

Fortunately, PANTONE colors are named just the way I want them and the second paragraph doesn't have a return, so handling them is easy.

So why am I doing all this? So I can say to my client: I'm planning to use the CMYK mix indicated by the top left caption. It corresponds to the PANTONE (why does PANTONE insist that everybody always SHOUT their name?) color at bottom right. My client can then refer to the PANTONE swatch book and compare what's on screen with what's in the PANTONE book so they can get some sense of what the printed version will look like.

By the way, even I can see that PANTONE 1807 C is not the same as 0, 75, 30, 60. The capture I took was in an intermediate state.

Wednesday, September 21, 2005

 

Text Processing

Here's an unusual need. I have a story that uses a bunch of styles. Some of them need to have the name of the style prefixed to every instance of a paragraph that has that style name, so, for example, the paragraph in the style Clone: rather than saying:
TKS-1
instead should say:
Clone:   TKS-1
where the whitespace is a tab.

This is the kind of thing that doesn't come up very often, and in fact those style names didn't have colons on the end when I started this. The challenge was to find a way of indicating which styles should get the treatment. It was certainly quicker to edit their names and add the colons than to manually insert these labels.

Here's the script:
//DESCRIPTION: Prefix colonized styles

var myStyles = app.activeDocument.paragraphStyles.everyItem().name;
var myStory = app.selection[0].parentStory;
var appliedStyles = myStory.paragraphs.everyItem().appliedParagraphStyle;
for (var j = myStyles.length -1; j >= 0; j--) {
 if (myStyles[j].slice(-1) == ":") {
  var myStyle = app.activeDocument.paragraphStyles[j];
  for (var k = appliedStyles.length - 1; k >= 0; k--) {
   if (myStyles[j] == appliedStyles[k].name) {
    myStory.paragraphs[k].insertionPoints[0].contents = myStyles[j] + "\t";
   }
  }
 }
}
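
Stripped of the InDesign objects, the idea looks like this. It's a sketch only: the paragraphs array and its contents are made up, and it makes a single pass over the paragraphs rather than looping over the styles first as the script does.

```javascript
// Hypothetical stand-in for a story: each paragraph is a style name plus text.
var paragraphs = [
  { style: "Clone:", text: "TKS-1" },
  { style: "Description", text: "Sample description" },
  { style: "Isotype:", text: "Sample isotype" }
];

// Prefix the style name and a tab to every paragraph whose
// style name ends in a colon; leave the others alone.
function prefixColonStyles(paras) {
  for (var k = paras.length - 1; k >= 0; k--) {
    var name = paras[k].style;
    if (name.slice(-1) == ":") {
      paras[k].text = name + "\t" + paras[k].text;
    }
  }
  return paras;
}

prefixColonStyles(paragraphs);
// paragraphs[0].text is now "Clone:\tTKS-1"; paragraphs[1].text is unchanged
```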

Sunday, September 18, 2005

 

Text Processing -- Regex Version of Symbol Replacement

This time with help from Shane Stanley, I have come up with the solution to the symbol tagging using Regex. Here's the new version of the function:
function insertSymTags(theText) {
 var myRE = new RegExp("κ|µ|α|β|δ|Δ|ζ|γ|ε|λ","g"); //Seeks any of Greek glyphs
 for (var j = theText.paragraphs.length - 1; j >= 0; j--) {
  var myText = theText.paragraphs[j].contents;
  var myNewText = myText.replace(myRE, "$&");
  if (myText != myNewText) {
   if (myNewText.slice(-1) == "\r") {
    theText.paragraphs[j].characters.itemByRange(0, -2).contents = myNewText.slice(0,-1);
   } else {
    theText.paragraphs[j].contents = myNewText;
   }
  }
 }
 return true
}
While I was at it, I also did a list clean up function using Regex:
function validateLists(theText) {
 var myStyles = theText.paragraphs.everyItem().appliedParagraphStyle;
 var myRE = new RegExp("^[0-9]*\\.*\\s*");
 for (var j = myStyles.length - 1; j >= 0; j--) {
  if (myStyles[j].name.indexOf("References") != -1) {
   var myText = theText.paragraphs[j].contents;
   var myLines = myText.split("\n");
   for (var k = myLines.length - 1; k >= 0; k--) {
    var myString = String(k + 1) + ". ";
    myLines[k] = myLines[k].replace(myRE, myString);
   }
   var myNewText = myLines.join("\n");
   if (myText != myNewText) {
    if (myNewText.slice(-1) == "\r") {
     theText.paragraphs[j].characters.itemByRange(0, -2).contents = myNewText.slice(0,-1);
    } else {
     theText.paragraphs[j].contents = myNewText;
    }
   }
  }
 }
 return true
}
Phew. Let's hope I can build on these early successes and start using Regex for more of my text processing scripts.
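
For anyone who wants to try the list clean-up pattern on its own, here is a small sketch in plain JavaScript. The renumber name and the sample references are made up; the pattern is the one from validateLists().

```javascript
// Optional digits, optional periods, optional white space at the start of a line.
var leadRE = new RegExp("^[0-9]*\\.*\\s*");

// Renumber every line of a forced-line-break list as "1. ", "2. ", and so on.
function renumber(text) {
  var lines = text.split("\n");
  for (var k = 0; k < lines.length; k++) {
    lines[k] = lines[k].replace(leadRE, String(k + 1) + ". ");
  }
  return lines.join("\n");
}

var fixed = renumber("1.Smith 2001\n3. Jones 1999\nBrown 2004");
// "1. Smith 2001\n2. Jones 1999\n3. Brown 2004"
```

One thing to watch: an entry whose own text begins with digits would have those digits swallowed by the pattern.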

 

Text Processing -- Symbols revisited

I have belatedly realized that some of the Greek symbols might need to be superscripted or italicized (or both), so rather than treat them as an after-the-fact process as I had been, I need to integrate them into the html tags process. The way I choose to do that is to insert tags into the text.

I had hoped to do this with Regex, but my attempts failed and I don't have the time to pursue it any further right now. So instead, I used this approach:
function insertSymTags(theText) {
 var mySymbols = ["κ","µ","α","β","δ","Δ","ζ","γ","ε","λ"];
 for (var j= theText.paragraphs.length - 1; j >=0; j--) {
  var myText = theText.paragraphs[j].contents;
  var myNewText = myText;
  for (var k = mySymbols.length - 1; k >=0; k--) {
   myNewText = myNewText.split(mySymbols[k]).join("" + mySymbols[k] + "");
  }
  if (myText != myNewText) {
   if (myNewText.slice(-1) == "\r") {
    theText.paragraphs[j].characters.itemByRange(0, -2).contents = myNewText.slice(0,-1);
   } else {
    theText.paragraphs[j].contents = myNewText;
   }
  }
 } 
}
I confess that it feels like cheating to resort to split/join, but I don't have the time to fully research this right now.

 

Terrific Help from Dirk

I've had some terrific help from Dirk Becker; check out his web site. I'll be back later to analyze his script and in the process hopefully learn some more about RegEx and particularly how to use it in the context of JavaScript.

But working with his script has opened my eyes to the realization that I'm going about dealing with the Greek symbols the wrong way. Rather than wait until after processing the html tags, I need to retro-fit them with html-like tags so that they too can be converted into character styles just like all the others. The reason for this is to allow for the likely possibility that some of them will be superscripted or subscripted and possibly italicized too.

I'll write more on this later today, but I wanted to remind myself immediately.

Saturday, September 17, 2005

 

Musing about RegExp

I think I understand the general principles of regular expressions. I also can see how powerful they can be. But it shouldn't be so hard to work out what the various calls do and how to use them. Rather than address this abstractly, let's look at the specific need.

With the text I'm processing there are many html tags calling for italics, superscript, etc. Because of the source for this data, it is OK to assume that the tags are well-formed, and I don't have to worry about nested tags either (although that assumption might need testing). My text is inside an InDesign story, so we have to get it out to an array of JavaScript strings upon which the RegExp searches will be performed.

But, the changes made as a result of said searches must be applied to the story in InDesign because the result will be to apply styling. So, I'm looking for a string like <i>some text</i> except that I have to allow for the case of the tags to be mixed. Seems to me that the simplest approach is to apply styling to the whole shebang and then, after the fact, use InDesign's Find/Change to get rid of the styled tags.

So, the first thing we have to do is construct a RegExp object to seek out the strings that match our strings of interest and then apply it. As a test, I wrote this to operate on the last paragraph of the story only:
app.findPreferences = null;
app.changePreferences = null;
myDoc = app.activeDocument;
myStory = app.activeDocument.stories[0]
myText = myStory.paragraphs[-1].contents;
myStyle = myDoc.characterStyles.item("Italic");
myRE = new RegExp("<i>.*?</i>", "i");
myFind = myText.search(myRE);
while (myFind > -1) { // search() returns -1 when there is no match
 myMatch = myText.match(myRE);
 myStory.paragraphs[-1].characters.itemByRange(myFind,myFind + myMatch[0].length - 1).appliedCharacterStyle = myStyle;
 myStory.paragraphs[-1].characters[myFind].contents = "*";
 myText = myStory.paragraphs[-1].contents;
 myFind = myText.search(myRE);
}
myStory.paragraphs[-1].search("*i>",false,false,"");
myStory.paragraphs[-1].search("</i>",false,false,"");
And a loud voice is screaming that there's got to be a better way! But at least it works.
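
One possible "better way", sketched in plain JavaScript: collect the position of every tagged run in one pass with exec() and the g flag, instead of searching again from the top after each change. findTagRanges is a made-up name, and applying the ranges to the story is not shown.

```javascript
// Return the inclusive [first, last] character positions of every
// <tag>...</tag> run in text, whatever the case of the tags.
function findTagRanges(text, tagName) {
  var re = new RegExp("<" + tagName + ">.*?</" + tagName + ">", "gi");
  var ranges = [];
  var m;
  while ((m = re.exec(text)) !== null) {
    ranges.push([m.index, m.index + m[0].length - 1]);
  }
  return ranges;
}

var sample = "<i>E. coli</i> and <I>in vitro</I> assays";
var ranges = findTagRanges(sample, "i"); // [[0, 13], [19, 33]]
```

Working through the ranges from last to first would presumably keep the earlier positions valid while tags are being removed.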

 

Text Processing -- A Wrinkle means RegExp

I've run into a wrinkle that requires an excursion into the world of RegExp (regular expressions). The HTML codes in this text use a mix of cases. Some of the <br> tags are written <BR>. And other tags have the same inconsistency. So, the split("<br>").join("\n") approach doesn't work. While in this case I could just double-up with little penalty, I'm going to have to come up with a more general solution for the other tags.

At first glance, the solution to the immediate issue is easy. The RegExp we need looks like this:
var myRG = new RegExp("<br>", "gi");
The first string argument is what we're looking for, the g in the second parameter stands for global which means seek out every item, and the i indicates case insensitivity. So, it would seem that our cleanUpBreaks() function should look like this:
function cleanUpBreaks(theText) {
 // Work a paragraph at a time
 var myTexts = theText.paragraphs.everyItem().contents;
 var myRG = new RegExp("<br>", "gi");
 for (var j = myTexts.length - 1; j >=0; j--){
  var myText = myTexts[j];
  // Change break tags to forced new lines
  var myNewText = myText.replace(myRG, "\n");
  // Eliminate all space runs
  var myParts = myNewText.split("  ");
  while (myParts.length > 1) {
   myNewText = myParts.join(" ");
   myParts = myNewText.split("  ");
  }
  // Eliminate spaces on either side of forced new line
  myNewText = myNewText.split(" \n").join("\n").split("\n ").join("\n");
  // Write back if changed
  if (myText != myNewText) {
   theText.paragraphs[j].characters.itemByRange(0, -2).contents = myNewText.slice(0,-1);
  }
 }
 return true
}
But if you look a little closer, most of that splitting and joining that comes after the use of the RegExp can be bundled into the RegExp itself.

I say "most" because the changing of space runs to single spaces operates globally in the current version, but I did that only to make sure there was not more than one space before or after the inserted forced line break. So, let's forget about that for now (it's an unrelated function and so ought to be done elsewhere, if at all) and improve our regular expression to address any spaces surrounding the original break tag.
function cleanUpBreaks(theText) {
 // Work a paragraph at a time
 var myTexts = theText.paragraphs.everyItem().contents;
 var myRG = new RegExp(" *<br> *", "gi");
 for (var j = myTexts.length - 1; j >=0; j--){
  var myText = myTexts[j];
  // Change break tags to forced new lines
  var myNewText = myText.replace(myRG, "\n");
  // Write back if changed
  if (myText != myNewText) {
   theText.paragraphs[j].characters.itemByRange(0, -2).contents = myNewText.slice(0,-1);
  }
 }
 return true
}
And there you have it. The space-asterisk pair on either side of the <br> in the regular expression causes the replace command in the loop to seek out zero or more spaces followed by <br> followed by zero or more spaces.
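
The pattern is easy to check on an ordinary string. A minimal sketch, with breaksToNewlines as a made-up name and a made-up sample line:

```javascript
// Zero or more spaces, a break tag in any case, zero or more spaces.
var breakRE = new RegExp(" *<br> *", "gi");

function breaksToNewlines(text) {
  return text.replace(breakRE, "\n");
}

var out = breaksToNewlines("1. Smith  <BR> 2. Jones<br>3. Brown");
// "1. Smith\n2. Jones\n3. Brown"
```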

Friday, September 16, 2005

 

Text Processing -- Pause to Reflect

As is often the case with anything other than a simple, single-purpose script, real experience with real data can lead to revelations that require some rethinking. As a result of what I've done so far, I've realized that I have rushed to set the formatting of those symbol characters too early. Apart from that, everything we've done so far has been with text that from paragraph to paragraph is uniformly formatted; until we start applying character styling (or local styling -- but there won't be any of that), we can process the text in JavaScript rather than in situ in the document. This has huge speed benefits not to mention the possibility of using grep (not that I expect to).

So, that means that I need to move the symbol characters call later. It also means that I need to separate the html processing some. At least the conversion of the <br> tag to a forced line break needs to be handled separately because that is used only in the references paragraphs to delineate the members of lists. This first data exposes the fact that some of these lists are not well formed, so I need to process them to make them well formed. This is best done before inserting any formatting at a more detailed level than the paragraph styles.

With these considerations in mind, the function to convert the breaks to forced line breaks looks like this. It would be interesting to compare this solution with one that used Find/Change within the document to see which is faster.
function cleanUpBreaks(theText) {
 // Work a paragraph at a time
 var myTexts = theText.paragraphs.everyItem().contents;
 for (var j = myTexts.length - 1; j >=0; j--){
  var myText = myTexts[j];
  // Change break tags to forced new lines
  var myNewText = myText.split("<br>").join("\n");
  // Eliminate all space runs
  var myParts = myNewText.split("  ");
  while (myParts.length > 1) {
   myNewText = myParts.join(" ");
   myParts = myNewText.split("  ");
  }
  // Eliminate spaces on either side of forced new line
  myNewText = myNewText.split(" \n").join("\n").split("\n ").join("\n");
  // Write back if changed
  if (myText != myNewText) {
   theText.paragraphs[j].characters.itemByRange(0, -2).contents = myNewText.slice(0,-1);
  }
 }
 return true
}
This is all relatively easy to follow thanks to the comments. The while loop eliminates space runs by changing all double spaces to single spaces until there aren't any double spaces left.
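
Here is that space-run loop on its own, as a small sketch in plain JavaScript (collapseSpaceRuns is a made-up name). The split is on two spaces and the join is on one, so each pass shortens the runs until no double space is left.

```javascript
function collapseSpaceRuns(text) {
  var parts = text.split("  ");   // split on a double space
  while (parts.length > 1) {      // more than one part means a double space was found
    text = parts.join(" ");       // put the parts back with a single space
    parts = text.split("  ");
  }
  return text;
}

var squeezed = collapseSpaceRuns("a     b  c"); // "a b c"
```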

The most difficult line in the script is the one that writes the changed paragraphs back into the document. The problem here is that if you simply write the whole new paragraph over the existing one, the paragraph styles get messed up because that information is held in the return at the end of the paragraph, and we're overwriting it. Hence the use of itemByRange() on the characters of the paragraph and the use of slice on the text in myNewText to leave the paragraph mark alone, thereby preserving the paragraph style. The different numbering schemes used by these two methods can catch you out. While itemByRange is inclusive of the character at the second index, slice() isn't.
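
The difference between the two numbering schemes shows up clearly on a plain string. In this sketch, inclusiveRange is a made-up helper that imitates an inclusive range with negative indexes counted from the end, which is how itemByRange() is described above; it is not InDesign's own code.

```javascript
// A paragraph as JavaScript sees it: the text plus its closing return.
var para = "TKS-1\r";

// slice(0, -1) stops BEFORE the last character, so the return is left out.
var body = para.slice(0, -1); // "TKS-1"

// Inclusive at both ends; negative indexes count back from the end.
function inclusiveRange(text, from, to) {
  var len = text.length;
  if (from < 0) { from += len; }
  if (to < 0) { to += len; }
  return text.slice(from, to + 1);
}

var sameBody = inclusiveRange(para, 0, -2); // "TKS-1"
```

So -2 in the inclusive scheme and -1 in slice() end up naming the same stretch of text.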

 

Text Processing -- Preserve Symbols

This job I'm doing involves a lot of Greek alphabetic characters which I have dubbed "symbols" because originally the Symbol font was used to present them. While the text font I've selected for the job, Myriad Pro, has all the glyphs I need, they are designed (not unexpectedly) to match the general look and feel of the Myriad family. As a result, the lowercase kappa looks entirely too much like a small cap K and the lowercase alpha looks more like a lowercase A. So, I'm proposing to my client that we use Warnock Pro as the source of our Greek alphabet.

Now we come to an acid test. What will those characters look like here if I just copy and paste them from my Unicode script? There's only one way to find out:
function preserveSymbols(theText) {
 try {
  //This will fail if the Symbol style does not exist already
  var myCharStyle = myDoc.characterStyles.item('Symbol');
  var mySymbols = ["κ","µ","α","β","δ","Δ","ζ","γ","ε","λ"]
  app.findPreferences = null;
  app.changePreferences = null;
  for (var j=0; mySymbols.length > j; j++) {
   theText.search(mySymbols[j],false,false,undefined,{},{appliedCharacterStyle:myCharStyle})
  }
  return true
 } catch (e) {
  myErr = myErr + "Document lacks a Symbol character style.\r";
  return false
 }
}
Looks good so far in this editing window and in the preview. I wonder if all browsers are suitably equipped. Let me know if you can't read them. Perhaps I could come up with HTML codes for them, except that I'm not sure I know what they're all called. Actually, it's only the seventh one that is a mystery to me. I picked this list up from a script I wrote the last time I did this catalog, which was about two years ago so it is no wonder I have forgotten some details.

Notice how this function joins in the myErr procedure to report that the Symbol character style is missing -- sure, the script could create it, but the way this job is supposed to work that style is supposed to be there; if it isn't, that reflects a serious problem, so better to report it than try to fix it and perhaps mask some other issue.

It occurs to me that I have done enough work on the main routine to take a look at that. Here's how it stands:
myLib = getProjectLibrary("B-Library.indl");
myDoc = app.activeDocument;
myErr = "";
var myStory = myDoc.stories[0];
if (myStory.tables.length > 0) {
 var myTable = myStory.tables[0];
 fixStyles(myLib.assets.item("B-Styles"))
 if (!applyParaStyles(myTable)) { errorExit(myErr) }
 burstTable(myTable);
}
trimText(myStory);
removeBlankParas(myStory);
if (!preserveSymbols(myStory)) { errorExit(myErr) }
Notice how I only convert the table if it's there. That's for testing purposes. It allows me to add functionality to the post-table text processing and continue incremental testing without having to revert to the original file every time.

So far so good. This new function did its job.

 

Processing Text -- Trim White Space

Let's face it, this is not something that happens often (at least, not at the start of paragraphs), and when it does, the chances of there being more than one space or tab are pretty low, and I can ignore all other kinds of white space (remember: this text came from Excel). So, let me try something straightforward and see how it does:
function trimText(theText) {
 // Trims spaces from the front and back of all paragraphs in theText
 // Trims tabs from the end
 // Start with the tail end of each paragraph
 var myContents = theText.paragraphs.everyItem().contents;
 for (var j = myContents.length - 1; j >= 0; j--) {
  var myContent = myContents[j];
  for (var k = myContent.length - 1; k >=0; k--) {
   var myChar = myContent.slice(k,k+1);
   if (myChar == "\r") {continue} // do nothing to return
   if ((myChar != " ") && (myChar != "\t")) {break};
   // If we get here, paragraph has one or more spaces or tabs "at end"
   // Thinks: for some kinds of text a tab is legit at end
   // But this doesn't apply to this project
   theText.paragraphs[j].characters[k].remove();
  }
 }
 // Now look at start of paragraphs; for safety refresh myContents
 var myContents = theText.paragraphs.everyItem().contents;
 for (var j = myContents.length - 1; j >= 0; j--) {
  var myContent = myContents[j];
  for (k = 0; myContent.length > k; k++) {
   myChar = myContent.slice(k,k+1);
   if (myChar == "\r") {break} // empty paragraph
   if ((myChar != " ") && (myChar != "\t")) {break};
   // If we get here, paragraph has one or more spaces at start
   // Tabs at start are not valid in this job. Could be in others.
   // Always remove the current first character: each removal shifts the
   // live text left, while myContent still holds the old positions.
   theText.paragraphs[j].characters[0].remove();
  }
 }
}
As I wrote this various thoughts occurred, some of which I wrote into the script as comments because I might be tempted to use this script on other text. The issue of whether or not a tab is legit at the start or end of a paragraph is an important consideration. I've certainly been in situations where they were valid. But for this job, they are not. So the script removes them (if any were found -- I inserted a couple just to be sure while testing).

Why, when looking at the end of the paragraph didn't I just start at the character before the trailing return? Why go to the trouble of detecting and then ignoring it? Because some paragraphs don't have trailing returns, notably the ones at the end of a text flow -- note that a text flow could be a cell in a table. While, for the vast majority of the cells in the table we're processing, this is not an issue because I'm not going to trim the white space until after the table has been converted to text, for the first row of the table, this is not a good strategy because we don't want paragraph style names that start or end in space or tab.
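
The "return may or may not be there" rule can be tried on plain strings. This is a sketch only: trimParagraph is a made-up name, and it uses a regular expression where trimText() walks the characters one at a time.

```javascript
// Trim spaces and tabs from both ends of one paragraph string,
// leaving a closing return in place if there is one.
function trimParagraph(text) {
  var tail = "";
  if (text.slice(-1) == "\r") {   // not every paragraph has one
    tail = "\r";
    text = text.slice(0, -1);
  }
  return text.replace(/^[ \t]+|[ \t]+$/g, "") + tail;
}

var a = trimParagraph("  Clone \t\r"); // "Clone\r"
var b = trimParagraph("\tTKS-1 ");      // "TKS-1", as at the end of a flow or cell
var c = trimParagraph(" \r");           // "\r"
```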

This means we have to revisit a function we thought was finished the other day and beef it up. Remember this:
 var myLim = theTable.columns.length;
 for (var j = 0; myLim > j; j++) {
  var myStyle = getParaStyle(theTable.cells[j].contents);
 }
Here, we passed off the untrimmed contents of the first row to be the names of our paragraph styles. It happens that the first cell had a space after its contents that shouldn't have been there, but since these things are invisible, one can hardly blame the data generator. So, now I have:
  trimText(theTable.cells[j].texts[0]);
added into the script immediately before the call to getParaStyle() so that the name passed to the paragraph style functions is not encumbered with leading or trailing white space.

 

Processing Text -- Blank Paragraph Removal

At this point in the proceedings, we may have some blank lines in our story. They could be the result of empty cells, of cells that end in a paragraph return, or of cells that had blank paragraphs within their text. It doesn't really matter where they came from, they've got to go.

There are various techniques for doing this. My preferred approach is this:
function removeBlankParas(theText){
 var myLengths = theText.paragraphs.everyItem().length;
 for (var j = myLengths.length - 1; j >= 0; j--) {
  if (myLengths[j] == 1) {
   theText.paragraphs[j].remove();
  }
 }
}
This approach to processing the paragraphs of a story (or any range of text) seems to be very efficient. It cuts down on the number of interactions with InDesign very dramatically when compared to getting the length of each paragraph individually. It does take some discipline (in more complicated scenarios) to distinguish between that which is the contents of a JavaScript variable and that which is the contents of the text you're working on, but the pay-off is often great.

In this case, because the action being performed changes the number of paragraphs in the text as the loop proceeds, it is necessary to work from the back. For some other actions, proceeding from the front is needed.
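
A plain array shows what goes wrong when items are removed while counting forwards. This is a sketch with made-up names, using strings in place of paragraphs; a length of 1 means the paragraph holds nothing but its return.

```javascript
// Backwards: removing item j never disturbs the items still to be visited.
function removeBlanks(paras) {
  for (var j = paras.length - 1; j >= 0; j--) {
    if (paras[j].length == 1) { paras.splice(j, 1); }
  }
  return paras;
}

// Forwards: the item that slides into a removed slot gets skipped.
function removeBlanksForward(paras) {
  for (var j = 0; j < paras.length; j++) {
    if (paras[j].length == 1) { paras.splice(j, 1); }
  }
  return paras;
}

var good = removeBlanks(["A\r", "\r", "\r", "B\r"]);       // ["A\r", "B\r"]
var bad = removeBlanksForward(["A\r", "\r", "\r", "B\r"]); // ["A\r", "\r", "B\r"]
```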

I just tested the script, but it exposed another problem; one that has been lurking at the back of my mind for a while because one of the paragraph styles I created a few days ago had a name that ended with a space: some "empty" paragraphs actually have a whitespace character in them. Let's write a function for this. I can use it to cleanse the names of the paragraph styles created earlier in the process and I'll need to use it here before running removeBlankParas().

I'm going to assume that there is a one-to-one relationship between the character positions in the text and the characters in the strings in JavaScript variables. This is a good assumption as long as none of the text is in certain Asian languages.

So, the framework looks like this:
function trimText(theText) {
 // Trims spaces from the front and back of all paragraphs in theText
 // Trims tabs from the end
 var myContents = theText.paragraphs.everyItem().contents;
 for (var j = myContents.length - 1; j >= 0; j--) {
 }
}
And the big question is just what to put into the loop. Which I'll have to put off until my next session.

 

Text Processing -- Table to Text

This should be easy. At some point in the proceedings, I need to convert the table I started with to text. Given the existence of the convertToText() method, it's a cakewalk. The nature of the table I'm working with is that each row represents all the data about a particular item (that will eventually appear in the catalog), so each cell represents a paragraph (or set of paragraphs) that need to be in sequential order starting at the top-left of the table (except the first row) and proceeding left to right, row by row until the end of the table.

The question is when to do this in the workflow. We also need to perform a large number of fiddly text conversions: Greek alphabet into the character style Greek (I'm using Myriad Pro for the text, which includes the Greek alphabet, but I don't like the look of those glyphs in this context; I'm going to propose to the client that we use Warnock Pro for these characters). We also need to interpret a bunch of html tags for italic, superscript, etc.

One of the advantages of using functions for this kind of thing is that it simplifies reordering the processing. My gut feel is that these kinds of text processing will proceed quicker on a long single story rather than on the large number of cells in a table (the first table I'm working with has 143 rows of 11 columns), but I'll do some timing experiments later to confirm this.

Indeed, it was easy:
function burstTable(theTable) {
 // First, remove the first row
 theTable.rows[0].remove();
 theTable.convertToText("\r","\r");
}
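
The order the text should come out in can be pictured with a plain array of rows. This is only a sketch of the intended result, with made-up data and a made-up burstRows name; it assumes, going by the call above, that both the column and row separators are returns.

```javascript
// Hypothetical table as an array of rows; row 0 holds the style names.
function burstRows(rows) {
  var paras = [];
  for (var r = 1; r < rows.length; r++) {       // skip the first row
    for (var c = 0; c < rows[r].length; c++) {  // left to right
      paras.push(rows[r][c]);
    }
  }
  return paras.join("\r");
}

var story = burstRows([
  ["Name", "Clone:"],
  ["Anti-X", "TKS-1"],
  ["Anti-Y", "TKS-2"]
]);
// "Anti-X\rTKS-1\rAnti-Y\rTKS-2"
```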

Thursday, September 15, 2005

 

Text Processing -- Applying Those Styles

We return to Tuesday's function, the one that looked like this when we left it:
function applyParaStyles(theTable) {
 // Returns true if all OK,
 // else logs error to global variable myErr and returns false
 var myLim = theTable.columns.length;
 for (var j = 0; myLim > j; j++) {
  var myStyle = getParaStyle(theTable.cells[j].contents);
 }
}
It turns out that it only takes a couple more lines of code to complete this script (although there might be more needed later to deal with error conditions that I so far haven't thought of). After we get the style for a column, we can apply it to all the text in the column with a single command:
  theTable.columns[j].cells.everyItem().texts[0].appliedParagraphStyle = myStyle;
Notice that we applied the style not to the cells but to the texts[0] of the cells. Why? Because some of the cells could be overset and we still want to apply the style to the overset text.

Finally, we need to indicate to the calling routine that all is well by returning true.

 

Text Processing -- Fixing Styles

The intention of the script we're working on is to process a newly created document which has so far had nothing done to it except have a formatted table of data placed in it. (Thinks: I wonder if I could get by with unformatted; some of these tables are huge so it's worth checking. All the "local formatting" is indicated by html tags, so we don't have to worry about that this time around; that was not true last time.) This still means that it has within it any default paragraph and character styles (also object styles, but I'm not using them, so they don't matter right now). What we need are the styles (more particularly, styles with the right names) that we created over the past couple of days. Remember: I saved them into a library item.

So, I wrote this function:
function fixStyles(theAsset) {
 // First, we clear out any existing paragraph or character styles
 var myStyles = myDoc.paragraphStyles; // myDoc is global
 try {
  for (var i = (myStyles.length -1); i > 0; i--) {
   myStyles[i].remove();
  }
 } catch (e) {} // CS2 can't delete [Basic Paragraph]
 myStyles = myDoc.characterStyles;
 for (i = (myStyles.length -1); i > 0; i--) {
  myStyles[i].remove();
 }
 app.select(null); // Make sure there is no selection
 var myTF = theAsset.placeAsset(myDoc)[0];
 myTF.remove();
}
This code is fairly self explanatory, but let me elaborate a little:
  1. The try construct when deleting paragraph styles is there because while InDesign CS allows all paragraph styles except [No Paragraph Style] to be deleted -- that's why the loop only goes down to item 1 -- CS2 also won't let you delete [Basic Paragraph]. So, when running this script with CS2, the last time around will create an error which is then ignored.
  2. For the character styles, we again avoid the first item because [No Character Style] also can't be deleted.
  3. Both these loops iterate backwards through the styles. That's to avoid referring to an item that has already been removed -- by starting at the back end, this problem is avoided.
  4. Because of that, we don't actually need the myLim variable I usually create to control the loop because the first statement in the for loop control is only executed once; that's a good argument for always iterating backwards!
  5. The script makes sure there is no selection when placing the asset on the document because if there were a text selection at the time, that text would be obliterated and the reference to the placed asset wouldn't work.
  6. Because I just about never have multiple objects in the same library asset, I tend to forget that the results of placing an asset can lead to multiple items being added to the document. That's why I need to get the first item into the myTF reference so I can delete it after it has done its job (of bringing with it the paragraph and character styles that I need).
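To see why iterating backwards matters, here is a small sketch in plain JavaScript. An ordinary array stands in for a collection such as myDoc.paragraphStyles and splice() plays the part of remove(); the function names and the sample style names are mine, purely for illustration, and a real InDesign collection may misbehave in a different way than an array does.

```javascript
// Plain arrays stand in for an InDesign collection; splice() plays the
// role of remove(). Both function names are hypothetical.
function removeAllButFirst(items) {
 // Walk from the last item down to item 1; item 0 is left alone.
 for (var i = items.length - 1; i > 0; i--) {
  items.splice(i, 1);
 }
 return items;
}

function removeForwardNaive(items) {
 // Going forward, each removal shifts the later items down by one,
 // so every second item gets skipped.
 for (var i = 1; i < items.length; i++) {
  items.splice(i, 1);
 }
 return items;
}

var backward = removeAllButFirst(["[None]", "Head", "Body", "Caption", "Note"]);
var forward = removeForwardNaive(["[None]", "Head", "Body", "Caption", "Note"]);
```

With these inputs, backward ends up holding only "[None]", while forward still holds "[None]", "Body" and "Note" because the forward loop stepped over them.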
I expect to work more on this job later today.

Wednesday, September 14, 2005

 

More on Processing Text -- Finding a Library

I'm picking up where I left off yesterday. I took the results of yesterday's script and used it to create a text frame that contains a sample of each paragraph style, having redefined them to match what I want. So, the next step is to write the part of the script that actually assigns the paragraph styles to the text in the columns. We have to watch out for overset text when doing this.

But first, I need the code to go fetch the styles from the library. This is a pick-up from earlier work. It's a function I wrote that searches for a particular library by name, starting in the folder that holds the active document and working upwards until it either finds the library or reaches the top of the folder hierarchy. In the latter case, it gives the user the chance to open a library from anywhere; there follows some heavy duty checking to make sure that the selected file really is a library. For my purposes on this project, I'm always going to find the library, so I suppose I could create a shorter version, but what's the point?
function getProjectLibrary(libName) {
 var myDoc, myLib, myFolder, myLibFile, myLibName
 myDoc = app.activeDocument;
 try {
  myLib = app.libraries.item(libName);
  myLib.name;
  return myLib
 } catch (e) {
  // Library is not open; go find it and open it
 }
 try {
  myFolder = app.activeDocument.filePath;
 } catch (e) {
   myFolder = null;
 }
 while (myFolder != null) {
  if (File(myFolder.fsName + "/" + libName).exists) {
   app.open(File(myFolder.fsName + "/" + libName));
   return app.libraries.item(libName);
  } else {
   myFolder = myFolder.parent;
  }
 }
 while (true) {
  myLibFile = File.openDialog("Please locate a Library file to use");
  if (myLibFile == null) {
   // User canceled so bye-bye
   exit();
  }
  myLibName = myLibFile.name;
  // If name contains "Archive.indl" double-check with user
  if (myLibName.indexOf("Archive.indl") != -1) {
   if (!confirm("Selected library is an archive; continue anyway?")) {
    continue; // Ironic name for this command, given the question
   }
  }
  if (myLibName.indexOf(".indl") == -1) {
   if (!confirm("File does not have a '.indl' extension; continue anyway?")) {
    continue; // Ironic name for this command, given the question
   }
  }
  // If we get here, either the name is good or the user said to continue
  try {
   app.open(File(myLibFile.fsName));
   if (app.activeDocument != myDoc) {
    app.activeDocument.close(SaveOptions.no);
    throw("Sorry. The file you selected was not a library file.");
   }
  } catch(e) {
   throw("Sorry. The file you selected was not a library file.");
  }
  try { // Necessary in case user opened a book file
   myLib = app.libraries.item(libName);
   myLib.name;
   return myLib
  } catch (e) {
   throw("Sorry. The file you selected was not a library file.");
  }
 }
}
I'm not going to provide a lot of commentary about this function (I don't have time), but it does show up one interesting difference between the file system object model and the InDesign object model: at the top of the file system hierarchy, if you try to get the parent of the top item, you get a null response, while in the InDesign object model the application is its own parent -- that means that in InDesign you always have to watch out for infinite loops when searching up the parental chain.
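Here is a minimal sketch of a parent-chain search that copes with both conventions. It uses plain JavaScript objects as stand-ins (none of these are real File, Folder or InDesign objects), and findAncestor and named are hypothetical names of my own.

```javascript
// Hypothetical stand-ins: a "file system" chain whose top has a null parent,
// and an "application" object that is its own parent.
function findAncestor(start, isWanted) {
 var current = start;
 while (current != null) {
  if (isWanted(current)) { return current; }
  // Stop when the parent is missing (file-system style) or when an
  // object reports itself as its own parent (InDesign style).
  if (current.parent == null || current.parent === current) { return null; }
  current = current.parent;
 }
 return null;
}

function named(wanted) {
 return function (obj) { return obj.name == wanted; };
}

var root = { name: "root", parent: null };
var projects = { name: "Projects", parent: root };
var job = { name: "Job", parent: projects };

var appLike = { name: "Application" };
appLike.parent = appLike; // the application is its own parent
var docLike = { name: "Document", parent: appLike };
var frameLike = { name: "TextFrame", parent: docLike };

var foundFolder = findAncestor(job, named("Projects")); // the projects object
var noGroup = findAncestor(frameLike, named("Group")); // null, and the loop ends
```

The second test in the if statement is what keeps the search from spinning forever once it reaches an object that is its own parent.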

Tuesday, September 13, 2005

 

Processing Text

A new job (actually, a repeat of some work I did two years ago in AppleScript with InDesign 2, but with some new wrinkles) has come in. It requires me to write a suite of scripts to do serious processing to some text. The input is coming in the form of Excel spreadsheets, which I import into InDesign CS2 and the script takes it from there.

This will be a script that performs many functions sequentially. The ultimate goal would be to have the script take the input and convert it into a section of a catalog, but I'll probably fall short of achieving that. The first step is to use the entries in the first row of the table to determine which paragraph style should be applied to each column. The reason for doing this is that later we'll convert these tables back into text with each cell converted to a paragraph (or series of paragraphs if there's more than one in any cell). The paragraph styles will tell a later script (or later part of this script, depending on how things go) what to do with the paragraph in question.

To facilitate the script's growth as I add functionality, I'm going to use user-defined functions for each discrete part of the script. An advantage of this approach is that functions can implicitly make rules about the state of things when they're called, so they don't have to spend any time on assuring that the initial set-up is correct -- they just assume it and leave the calling routine (the main script or a higher function) to make sure things are just so. With that, here goes:
function applyParaStyles(theTable) {
 // Returns true if all OK,
 // else logs error to global variable myErr and returns false
}
While in a perfect world you might think that all variables should be local and if the function needs one it should be passed as an argument, for certain things it's worth setting up some global variables. For the moment, I'm requiring that my main script set up three global variables: myDoc is a reference to the active document; myLib is a reference to the project library; and myErr is a string variable that is initialized to the empty string and which is used to log errors.

Project Library? Almost every job I do uses a library to some extent. The first thing we're going to use it for on this job is to hold the definitions of the paragraph and character styles we're going to need. Later, we'll need it to hold templates of the display elements we'll be using to populate our pages.

Why Log Errors? Eventually, this script will do a lot of processing. It will take a while to run and it will need to deal with out-of-spec data. Rather than just stop when that is detected, it will log an error and continue. That way, at the end of the run I'll have information about all that went wrong.

Thinks: It is distinctly possible that this script will crash InDesign. If that happens, I could lose my error log, so it might be better to log them to a text file. For now, I'll use a function to log errors so that when things start to become complex, I can change the strategy by simply replacing that function.
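As a rough sketch of that idea, assuming the global myErr string described above: logError is a hypothetical name for the logging function, and checkHeading is just an illustrative caller, not part of the real script.

```javascript
// myErr is the global error log described above; logError is a hypothetical
// helper name. Swapping its body later (say, to append to a text file)
// would not require touching any caller.
var myErr = "";

function logError(message) {
 myErr += message + "\n";
}

function checkHeading(heading) {
 // Illustrative caller: report a problem, then carry on.
 if (heading == "") {
  logError("Empty column heading");
  return false;
 }
 return true;
}

checkHeading("Product Name");
checkHeading("");
```

After these two calls, myErr holds a single line reporting the empty heading, and nothing stopped the script along the way.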
 var myLim = theTable.columns.length;
 for (var j = 0; myLim > j; j++) {
  var myStyle = getParaStyle(theTable.cells[j].contents);
 }
Here's the top level of the function. It sets up a loop to process each column. That means that the first thing we have to do is get the paragraph style that goes with each column. Notice that because this information is in the first row, I can simply look at the first myLim cells of the table because those are the cells of the first row. I'm taking advantage of the fact that these tables have no merged cells.

So, already, we need another function getParaStyle() to get the paragraph style that corresponds to the name in each column head. For the moment, I'm going to write a temporary function that checks to see if there is a style with the name in question, and if not it will make one. Later, we'll be more discerning, insisting that only existing styles be used and logging an error if we don't recognize the name in the column head -- this is because my client has a number of people working on this job and not all of them use exactly the same terminology (at least, they didn't last time around).

Here's the first attempt at this function:
function getParaStyle(theName) {
 // Temporary function that either returns the paragraph style or makes one and returns it
 try {
  var theStyle = myDoc.paragraphStyles.item(theName);
  theStyle.name; // triggers error if theStyle is undefined
 } catch (e) {
  theStyle = myDoc.paragraphStyles.add({name:theName});
  try {
   theStyle.name;
  } catch (e) {
   errorExit("Couldn't make style with name: " + theName);
  }
 }
 return theStyle; // hand the found (or newly made) style back to the caller
}
After all that stuff about error reporting, how come this function directly calls errorExit? Because this is a temporary function. If the paragraph style can't be made, there's no point in continuing.

Woo-hoo! It worked first time! Well, that's enough excitement for one evening.

Sunday, September 11, 2005

 

Real Work Interfering

As is often the case, I've been excruciatingly busy the last few days and I've done just about no scripting during that time. However, this morning, I returned my attention to an old favorite, my Insert Date script.

This little script is one of the first I wrote when I originally switched to JavaScript from AppleScript back in the early days of InDesign CS. A few weeks ago, an improvement was requested by one of my friends on the BlueWorld list and I have finally gotten around to upgrading my own copy of the script to include the improvement.

The script is instructive in a couple of ways: it shows how to use the JavaScript Date object to get information about a date, and it also shows how to manipulate substrings within JavaScript. Here is the original version of the script:
today = new Date();
myDateString = today.toLocaleDateString();
app.selection[0].contents = myDateString;
Briefly, the first line creates a new date in a variable named today. When you create a new date object, if you don't set it to a value, then it defaults to the current date (i.e., the date when the script is run).

If you look at DevGuru's JavaScript Date Object page, you'll see that one of the methods available for date objects is the toLocaleDateString() method I use in the second line to create the date string that corresponds to the locale where the script is being run. The third line simply inserts this string into the current selection.

This early version has a number of problems, some of which are already familiar: what if there is no selection? What if it's the wrong kind of selection? We know from previous scripts how to deal with those two: wrap the script in the selection framework and add a check that the current selection is pure text.

But there are also problems with the string itself. Run it today, and the result doesn't look too bad:
Sunday, September 11 2005
But had I run it two days ago, the result would have been
Friday, September 09 2005

Ugh! Who wants the leading zero in the number of the day? And shouldn't there be a comma after that number? Most people would think so.

There are many ways that these two problems could be solved. To solve the leading zero problem, I added this immediately after forming myDateString:
myParts = myDateString.split(" 0");
if (myParts.length != 1) {
 myDateString = myParts[0] + " " + myParts[1];
}
The split() method for strings breaks a string into an array of parts (hence my name for the variable I used) using the argument (in this case a space followed by a zero) as a separator. If that argument does not appear in the original string (as it doesn't today) then myParts is an array of length 1, in which case there is nothing to do, but on Friday, there would have been two parts, so I knit them back together using just a space as a separator.

It is worth noting that this piece of code is very much oriented to the specific string that I get here in New Jersey using the English language. In other parts of the world, in other languages, the string might look completely different, so this particular script might not work for you.

We still need to deal with the missing comma. For that, I use the slice() method.
myDateString = myDateString.slice(0,-5) + "," + myDateString.slice(-5);
As you can see, I use it in two forms, first with a second argument and then with just one. You can read all about the JavaScript String object and its methods on DevGuru's JavaScript String Object page. When looking at that page remember two things:
  1. String literals can be formed by simply assigning a string to a variable; such strings act very like string objects
  2. Many of the methods on that page relate to displaying strings on web pages; they are not supported by InDesign's ExtendScript
Generally speaking, you do not need to use the String() method to create a string. Just pass a string to a variable. The most common time you'll find yourself using the String() method is when you are attempting to concatenate a number with a string. JavaScript will normally coerce the number for you when the other operand of + is a string, but if the number is before the string you can make the conversion explicit by writing something like
String(4) + " times a day"
to produce the string "4 times a day". This looks a bit silly, but if that 4 was contained in a variable it would make more sense
String(frequency) + " times a day"
So, with all the additions listed here, our script ends up looking like this:
//DESCRIPTION: Insert date at text selection

Object.prototype.isPureText = function() {
 switch(this.constructor.name){
  case "InsertionPoint":
  case "Character":
  case "Word":
  case "TextStyleRange":
  case "Line":
  case "Paragraph":
  case "TextColumn":
  case "Text":
   return true;
  default :
   return false;
 }
}

if ((app.documents.length != 0) && (app.selection.length != 0)) {
 if (!app.selection[0].isPureText()) { errorExit() }
 today = new Date();
 myDateString = today.toLocaleDateString();
 myParts = myDateString.split(" 0");
 if (myParts.length != 1) {
  myDateString = myParts[0] + " " + myParts[1];
 }
 myDateString = myDateString.slice(0,-5) + "," + myDateString.slice(-5);
 app.selection[0].contents = myDateString;
} else {
 errorExit();
}

// +++++++ Functions Start Here +++++++++++++++++++++++

function errorExit(message) {
 if (app.version != 3) { beep() } // CS2 includes beep() function.
 if (arguments.length > 0) {
  alert(message);
 }
 exit(); // CS exits with a beep; CS2 exits silently.
}
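Given the caveat above about the locale string looking different in other parts of the world, one alternative is to skip toLocaleDateString() and assemble the string from the Date object's parts. This is only a sketch: formatLongDate is a hypothetical name, and the English day and month names are hard-coded, so it trades one kind of locale dependence for another.

```javascript
// A sketch that avoids toLocaleDateString() altogether. The English day and
// month names are hard-coded, and formatLongDate is a hypothetical name.
function formatLongDate(theDate) {
 var days = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"];
 var months = ["January", "February", "March", "April", "May", "June", "July",
  "August", "September", "October", "November", "December"];
 return days[theDate.getDay()] + ", " + months[theDate.getMonth()] + " " +
  theDate.getDate() + ", " + theDate.getFullYear();
}

// Months count from 0, so 8 is September.
var friday = formatLongDate(new Date(2005, 8, 9));
```

Because getDate() returns a number, there is no leading zero to strip, and the comma before the year is simply part of the concatenation.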

Thursday, September 08, 2005

 

Script of the Day -- Changing Case Again

A member of the BlueWorld list asked for a script to change the case of text in a particular style to lowercase. Completely forgetting that Shane Stanley had posted an AppleScript to do this a few weeks ago, I embarked on converting one of my scripts that makes use of a simple user dialog to do the job. Like many scripts, the doing of the job is relatively easy, but giving the user a way to say what he wants to do is a little more challenging.

In this case, I decided to let the user pick a paragraph style and choose which of the four possible case changes he wants to apply. Once more, the script is wrapped in the selection handling framework. Then a dialog with two drop-down menus is presented to the user. Finally, the real work is done in the small loop just before the functions.
//DESCRIPTION: Converts text in designated parastyle to designated case

if ((app.documents.length != 0) && (app.selection.length != 0)) {
 myDoc = app.activeDocument;
 myStyles = myDoc.paragraphStyles;
 myStringList = myStyles.everyItem().name;
 myCaseList = ["Uppercase","Lowercase", "Title case", "Sentence case"];
 myCases = [ChangecaseMode.uppercase, ChangecaseMode.lowercase, ChangecaseMode.titlecase, ChangecaseMode.sentencecase];

 var myDialog = app.dialogs.add({name:"Case Changer"})
 with(myDialog){
  with(dialogColumns.add()){
   with (dialogRows.add()) {
    with (dialogColumns.add()) {
     staticTexts.add({staticLabel:"Paragraph Style:"});
    }
    with (dialogColumns.add()) {
     myStyle = dropdowns.add({stringList:myStringList,selectedIndex:0,minWidth:133});
    }
   }
   with (dialogRows.add()) {
    with (dialogColumns.add()) {
     staticTexts.add({staticLabel:"Change Case to:"});
    }
    with (dialogColumns.add()) {
     myCase = dropdowns.add({stringList:myCaseList,selectedIndex:0,minWidth:133});
    }
   }
  }
 }
 var myResult = myDialog.show();
 if (myResult != true){
  // user clicked Cancel
  myDialog.destroy();
  errorExit();
 }
  theStyle = myStyle.selectedIndex;
  theCase = myCase.selectedIndex;
  myDialog.destroy();

  app.findPreferences = null;
  app.changePreferences = null;
  myFinds = myDoc.search('',false,false,undefined,{appliedParagraphStyle:myStyles[theStyle]});
  myLim = myFinds.length;
  for (var j=0; myLim > j; j++) {
   myFinds[j].texts[0].changecase(myCases[theCase]);
  }

} else {
 errorExit();
}

// +++++++ Functions Start Here +++++++++++++++++++++++

function errorExit(message) {
 if (arguments.length > 0) {
  if (app.version != 3) { beep() } // CS2 includes beep() function.
  alert(message);
 }
 exit(); // CS exits with a beep; CS2 exits silently.
}
I'm thinking that a variation of this script that let you choose character styles instead (or even one that let you choose one or the other) might be equally useful. And, I should probably integrate the early Smart Title Case script rather than lumber the user with InDesign's built-in dumb title case.

Tuesday, September 06, 2005

 

Musing about Objects

Everyone knows what an object is. A document is an object, a text frame is an object, a library is an object; indeed, just about anything you work with in an InDesign session is an object, including the application itself. But other things are equally not objects, for example, your default setting for horizontal measure units is not an object -- it's a property of an object. Palettes are not objects (although windows are). Palettes could be, it's just that palettes are not yet incorporated into the scripting interface.

Is it worth trying to come up with a generalized description of an object when they're all so different from each other?

I confess to mixed feelings about this. One part of my brain screams of course it is, without a formal definition of an object how can you work with them? The other part responds that it is easy. I know what a document is, so I know what I can do with one with my scripts. Similarly for just about any other object.

Indeed, there are some parts of the object model that I'm not particularly familiar with. I've managed without exploring the Table of Contents and Indexing related objects, and I have only a minor acquaintance with the XML objects. My understanding of the color management objects is about on a par with my understanding of color management itself.

But these gaps in my knowledge haven't stopped me creating a lot of useful scripts; I don't spend much time working in those areas represented by the gaps in my knowledge and so I've felt little or no pressure to explore them. Meanwhile, I'm forever creating scripts to help me massage text or adjust the relationships of graphics and their containers and the like.

Still, let me grasp at a general description of an object. It is an entity (how's that for a nebulous word?) that fits somewhere in the tree of objects rooted in the application object. This tree has limbs that sprout in many directions. Some limbs are short: Application => Libraries => Assets. Some are long and theoretically boundless: Application => Documents => Spreads => Pages => Page Items => Page Items => Page Items ...

Page Items? There's a term you might not be familiar with if you have not scripted. Page Items is the name of an umbrella object type that includes all objects that can live directly on a page. So, rectangles, ovals, graphic lines, text frames and groups are all examples of page items. Graphics is a similar umbrella term for, you guessed it, graphics that can be imported into InDesign documents. Graphics live inside certain kinds of page items.

My sense is that an object is best thought of in these terms:

There are four kinds of objects in the InDesign Object model: the Root Object, Elemental Objects, Generic Objects, and Terminal Objects, to coin three names.

The Root Object

The root object is the application object. It is the base upon which all else builds. It has a very large number of properties, many of which are in fact collections of elemental objects or references to terminal objects.

Elemental Objects

This is the class of the really useful objects, the ones you actually work with: text frames, paragraphs, characters, insertion points, stories, rectangles, paths, pages, sections, libraries, documents, and on and on.

Generic Objects

These provide scripters with a mechanism for working with a variety of similar elemental objects that share a lot of common properties. Their existence is a convenience for scripters. Examples are: page items, graphics, text, and widgets (dialog objects).

Terminal Objects

This is my least favorite name for a class of objects. Perhaps I'll come up with something better. These are objects that have no children. They "terminate" their particular limbs of the object tree. Examples are the many settings, preferences and options objects that exist either as children of the application or of documents.

---------------------------------------------

So that's it for now. I can't say that I'm entirely happy with the label "Terminal" nor even with its definition. By that definition, an asset -- the scripting name for items in a library -- is a terminal object when it is clearly an elemental object. Now why do I say that? Because assets form a collection of elements of their respective library parents while the other kinds of terminal objects are not members of collections.

Well, I don't have to solve every issue at once. Clearly, I need to come back to this issue.

Monday, September 05, 2005

 

Script of the Day -- Proxy Aware Align Graphic in Frame

If you're in the situation where you find yourself importing images (or any kind of graphic, really, but for me it is almost always images) into a pre-existing frame, you are likely frustrated by the absence of any kind of alignment tools for aligning a graphic within its frame. If the frame was empty before, the image aligns at top left of the frame. If it previously contained a graphic, the incoming image aligns itself where the previous one was.

Sometimes (even often) these are good choices, but when they're not, you are faced with futzing about to get the image where you want it. It was this problem that caused me to conceive of this script. It allows you to select either the image or its frame (note that I limited this version to rectangular frames because so far that's all I've ever needed; it could be expanded to allow other shaped frames, and I will certainly do that should the need arise in my work). Before running the script, choose a proxy point on the Control panel or Transform palette and the script aligns the graphic at the point on the rectangle's border that corresponds to the point you chose. (I often use Fit Frame to Content immediately after running this script, but I decided not to build it in because there are times when I don't want that.)

I think this is the first script I've posted here that uses a switch statement.
//DESCRIPTION: Align image in rectangular frame based on proxy

if ((app.documents.length != 0) && (app.selection.length != 0)) {
 var mySel = app.selection[0];
 var myMsg = "Please select a graphic or its rectangular frame.";
 if (mySel.constructor.name != "Rectangle") {
  mySel = mySel.parent;
 }
 if (mySel.constructor.name != "Rectangle") {
  errorExit(myMsg);
 }
 if (mySel.graphics.length != 1) {
  errorExit(myMsg);
 }
This first part of the script checks to make sure that the selection is appropriate, i.e., a rectangle or its contents. If it isn't, the script exits with a helpful error message.
 imgBounds = mySel.graphics[0].geometricBounds;
 myAnchor = app.layoutWindows[0].transformReferencePoint;
 frameBounds = mySel.geometricBounds;
These three lines collect the information we need to perform the alignment function: the graphic bounds in imgBounds, the currently active "proxy point" in myAnchor, and the rectangle's bounds in frameBounds.

Why did I make them global variables? Laziness. I'm going to fix that in the posted version because these variables don't need to be global and having them so might cause problems downstream if I incorporate this script into some larger script -- that's the main reason for being careful about using local variables in a simple script: one day, it might not be so simple or you might want to reuse it in a less simple situation.

Now we're ready to put the switch into operation. Depending on which proxy point is active, a different move is necessary -- it's all a matter of simple geometry. What's more, it doesn't matter what measurement units the user has activated.
 switch (myAnchor) {
  case AnchorPoint.topLeftAnchor :
   mySel.graphics[0].move([frameBounds[1],frameBounds[0]]);
   break;
  case AnchorPoint.topCenterAnchor :
   xLoc = (frameBounds[3] + frameBounds[1] -(imgBounds[3] - imgBounds[1]))/2;
   mySel.graphics[0].move([xLoc,frameBounds[0]]);
   break;
  case AnchorPoint.topRightAnchor :
   xLoc = frameBounds[3] - (imgBounds[3] - imgBounds[1]);
   mySel.graphics[0].move([xLoc,frameBounds[0]]);
   break;
  case AnchorPoint.leftCenterAnchor :
   yLoc = (frameBounds[2] + frameBounds[0] - (imgBounds[2] - imgBounds[0]))/2;
   mySel.graphics[0].move([frameBounds[1],yLoc]);
   break;
  case AnchorPoint.centerAnchor :
   xLoc = (frameBounds[3] + frameBounds[1] -(imgBounds[3] - imgBounds[1]))/2;
   yLoc = (frameBounds[2] + frameBounds[0] - (imgBounds[2] - imgBounds[0]))/2;
   mySel.graphics[0].move([xLoc,yLoc]);
   break;
  case AnchorPoint.rightCenterAnchor :
   xLoc = frameBounds[3] - (imgBounds[3] - imgBounds[1]);
   yLoc = (frameBounds[2] + frameBounds[0] - (imgBounds[2] - imgBounds[0]))/2;
   mySel.graphics[0].move([xLoc,yLoc]);
   break;
  case AnchorPoint.bottomLeftAnchor :
   yLoc = frameBounds[2] - (imgBounds[2] - imgBounds[0]);
   mySel.graphics[0].move([frameBounds[1],yLoc]);
   break;
  case AnchorPoint.bottomCenterAnchor :
   xLoc = (frameBounds[3] + frameBounds[1] -(imgBounds[3] - imgBounds[1]))/2;
   yLoc = frameBounds[2] - (imgBounds[2] - imgBounds[0]);
   mySel.graphics[0].move([xLoc,yLoc]);
   break;
  case AnchorPoint.bottomRightAnchor :
   xLoc = frameBounds[3] - (imgBounds[3] - imgBounds[1]);
   yLoc = frameBounds[2] - (imgBounds[2] - imgBounds[0]);
   mySel.graphics[0].move([xLoc,yLoc]);
   break;
 }
And that's the guts of the script. Note that if you have just placed a graphic in the previously empty rectangle then the top-left choice does nothing. Also, the center choice is already available on the Object menu: Fitting/Center Content.
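All nine cases follow one pattern: work out the slack between frame and image in each direction, then take none, half or all of it. Here is a sketch of that arithmetic in plain JavaScript. The alignedOrigin helper and its hFrac and vFrac parameters are my own hypothetical names, not part of the script above, and it only computes the point that would be handed to move().

```javascript
// Hypothetical helper: bounds are [y1, x1, y2, x2], as geometricBounds
// returns them; hFrac and vFrac are 0, 0.5 or 1 for left/center/right and
// top/center/bottom. Returns the [x, y] to hand to move().
function alignedOrigin(frameBounds, imgBounds, hFrac, vFrac) {
 var slackX = (frameBounds[3] - frameBounds[1]) - (imgBounds[3] - imgBounds[1]);
 var slackY = (frameBounds[2] - frameBounds[0]) - (imgBounds[2] - imgBounds[0]);
 return [frameBounds[1] + slackX * hFrac, frameBounds[0] + slackY * vFrac];
}

var frame = [0, 0, 100, 200]; // 200 wide, 100 tall
var image = [10, 10, 60, 110]; // 100 wide, 50 tall
var centered = alignedOrigin(frame, image, 0.5, 0.5);
var bottomRight = alignedOrigin(frame, image, 1, 1);
```

For this frame and image, centered comes out as [50, 25] and bottomRight as [100, 50], which agrees with the xLoc and yLoc expressions in the corresponding cases of the switch.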

The rest of the script is just clean-up and you'll recognize it as the tail end of my "selection framework" that I've used in a variety of scripts so far:
} else {
 errorExit();
}

// +++++++ Functions Start Here +++++++++++++++++++++++

function errorExit(message) {
 if (app.version != 3) { beep() } // CS2 includes beep() function.
 if (arguments.length > 0) {
  alert(message);
 }
 exit(); // CS exits with a beep; CS2 exits silently.
}
I can't believe it took me so long to get around to this script. I had become quite adept at copying and pasting values from one field to another in the Control panel!

Sunday, September 04, 2005

 

Fixing Figure References

My client assured me that figure references in a particular kind of feature box used in the chapters of the book I'm in the process of producing would always be in the form "Fig3-1" where "3" is the chapter number and "1" is the figure number. The figures themselves are in files named Fig3-1.psd or .tif or .pdf or even .eps.

So, I wrote a script to automatically import the figures based on what I found in the relevant paragraphs. But then chapters started arriving with the references in this form:

Fig3-1: Description of figure

or:

<Fig3-1: Description of figure>

or:

<Figure 3-1: Description of figure>

So, rather than just take the contents of the paragraph and use that to build the file name, I had to filter the contents to get the name of the file. Here's the function I'm currently using that deals with the above cases:
function stripName(theName) {
 theName = theName.split("Figure ").join("Fig");
 var theStart = theName.indexOf("Fig");
 var theEnd = theName.indexOf(":");
 if ((theStart == 0) && (theEnd == -1)) return theName;
 return theName.slice(theStart,theEnd)
}
Even as I prepared this note, it occurred to me that this function doesn't deal with the format:

<Fig3-1>

You can bet your life I'll be getting some of those soon enough. I guess the easy way to deal with that is to add a test to see if theEnd contains -1, and if so, do a search for ">". This should do it:
function stripName(theName) {
 theName = theName.split("Figure ").join("Fig");
 var theStart = theName.indexOf("Fig");
 var theEnd = theName.indexOf(":");
 if(theEnd == -1) { theEnd = theName.indexOf(">") }
 if ((theStart == 0) && (theEnd == -1)) return theName;
 return theName.slice(theStart,theEnd)
}
So, if the colon is absent, we check for a closing angle bracket, and if that's there use it for the end of the slice we take out of the text provided in the call. Of course, if the client manages to forget the colon but still includes the description, I'm going to end up with the wrong name, but that's not a disaster. The way I have the main routine structured, if it can't find the file, it puts the name of the (presumed) missing file into the frame instead of the image, so that missing files are not only easy to see on the page, I also know which one is missing.
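For comparison, here is a sketch of the same filtering done with a regular expression. This is an alternative I'm offering for illustration, not the function in use; figureName is a hypothetical name, and the pattern assumes the reference is always "Fig" or "Figure", optional spaces, then chapter number, hyphen, figure number.

```javascript
// A regular-expression sketch of the same idea; figureName is a hypothetical
// name. It accepts "Fig" or "Figure", optional spaces, then chapter-number.
function figureName(theText) {
 var myMatch = /Fig(?:ure)?\s*(\d+)-(\d+)/.exec(theText);
 if (myMatch == null) { return theText; } // nothing recognizable; pass it through
 return "Fig" + myMatch[1] + "-" + myMatch[2];
}

var fromLongForm = figureName("<Figure 3-1: Description of figure>");
var fromNoColon = figureName("Fig3-1 Description of figure");
```

Both calls yield "Fig3-1", so under these assumptions the forgotten-colon case would also produce the right file name.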

 

Object Model

One of the reasons I started this blog was that every time I attempt to write about the object model, particularly the InDesign object model, I all too quickly become overwhelmed by the sheer scope of the thing and I give up. But then, a few weeks later, I come back to it and start all over with yet another description of the Application object.

At least I start with the root object, the one that starts the whole ball rolling, rather than the AnchoredObjectDefault object, which happens to be alphabetically first on the list of all objects that InDesign knows about. But the problem with starting with the Application object is that it is (understandably) one of the most comprehensive objects there are in the whole model, so writing about it takes a long time compared to, for example, the Library object. Libraries are really rather simple objects on a lowly-populated branch of the model.

At one point, in my early days of JavaScripting, I even wrote a piece about the Library branch of the model, thinking that the lessons learned from that branch would serve as a basis for exploring the whole thing. The article is still posted on my website, here: Libraries Assets. It reflects my understanding of the issues around the time when InDesign CS first hit the street.

One of the problems that I run into is that of wanting to show examples of how to use various elements of the model, but it is rare that one can do that without drawing upon knowledge of other parts of the model. It happens that one of the jobs I'm working on right now is a text book for culinary arts students. I was struck by the fact that the book introduces sauté pans long before it touches on what it means to sauté food. Perhaps this is just one of the challenges of presenting complex information about interacting objects. The authors of the on-line description of the Ruby language say essentially the same thing in their introduction.

So, I am not alone in grappling with these issues. This is the first of what I'm sure will be many items on this topic. Let us hope that I can quickly pass from philosophy to creating something useful (at which point, I'm not so sure that a blog is the best place to post, but perhaps I can use the blog as a place for public review of content).

Saturday, September 03, 2005

 

Researching Snippets vs. Libraries

I'm about to find myself embroiled in a catalog project. This is my second time around with this particular publication, which involves presenting a large number of complex objects that are similar to each other and yet subtly different. The last time around, I used InDesign 2, AppleScript, and a support library of object templates.

For this time around, I'll be using InDesign CS2, JavaScript and (I think) a folder of snippets (the alternative is to once again use a library). Libraries have the advantage that you can place library assets directly inline in a story (although you do have to use live insertion points to do so).

OK, clearly, if I'm to use snippets, I need a script to place a snippet inline. So, let's take a look at that issue.
//DESCRIPTION: Test Script to place a snippet inline

var myFile = File.openDialog("Find a snippet:");
if (myFile == null) { exit() }
The first thing the script does is have the user find a snippet in the filing system. The File.openDialog() call returns null if the user cancels.

For now, we'll just assume that the active document has a currently active insertion point, so
var myDoc = app.activeDocument;
var myIP = app.selection[0];
So, now we can use myDoc to refer to the active document and myIP to refer to the current insertion point.

It appears that to place the snippet, I must use:
myDoc.place(myFile)
but this returns nothing, so how do I know what I just placed? It's time to actually try the script to see what happens; perhaps the placed item will be selected.

Nope. It's just sitting there on the page. So somehow I have to deduce what it is. Some experimentation suggests that after placing a snippet, myDoc.pageItems[-1] is a reference to it. While researching this, I noticed that myDoc.place() is new in CS2. In CS, you had to place on to a page. You can still do that in CS2, where according to the scripting reference:
myObj = myDoc.pages[0].place(myFile)
should give a reference to the placed object. Turns out that this is true for everything except a snippet, so there is no advantage to using that syntax.
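Taking the experimental observation above at face value, the place-then-look-it-up step can be wrapped in a small helper. This is only a sketch: the function name placeSnippet is made up for illustration, and the idea that the new snippet ends up as the last entry in pageItems comes from experiment, not from the scripting reference. It uses length - 1 rather than -1 as the index so that the same code also runs against an ordinary array.

```javascript
// Hypothetical helper: place a snippet and return a reference to it.
// Assumption (experimental, undocumented): the newly placed snippet
// becomes the last item in the document's pageItems collection.
function placeSnippet(myDoc, myFile) {
  var myBefore = myDoc.pageItems.length;
  myDoc.place(myFile); // gives back nothing usable for a snippet
  var myAfter = myDoc.pageItems.length;
  if (myAfter == myBefore) {
    return null; // nothing was added, so don't guess
  }
  return myDoc.pageItems[myAfter - 1];
}
```

Comparing the item count before and after the call at least catches the case where nothing was placed, instead of silently handing back some unrelated page item.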

OK, enough nosing around. Let's get on with the task of importing a snippet and making it inline. We have the insertionPoint in myIP and we know how to get a reference to the placed snippet, so it should be as easy as:
var myObj = myDoc.pageItems[-1];
myObj.move(LocationOptions.after,myIP);
Oh misery me. That doesn't work. I knew that! You can't move a pageItem inline directly. You have to copy and paste. But I don't want to do that; I don't like using the clipboard for this kind of thing. It happens that there is an alternative approach using a library, but if I have to use a library in order to import my snippets, what's the point of using snippets when I can simply put the templates in the library and be done with it?

So, today's script doesn't work, but still, it has helped me make an important decision. I've also taken the chance to communicate this problem with snippets to appropriate people at Adobe.

Friday, September 02, 2005

 

Very Hectic Day

Today was very hectic. I wrote one special-purpose script, so let me share it here. I had created a map in the form of a group of elements in one InDesign document which was created from scratch for the purpose. Consequently, it used my application default setting for TextFramePreferences/Ignore Text Wrap, which is off. In the series of documents I'm working on, this option is set the other way, and so I overlooked the issue until I tried to use the map in my live document as part of a feature box that intrudes into the main story and so has text wrap turned on.

All the text frames in my map (country names, mainly) instantly became overset. Bah humbug!

So, I decided to write a quick script to solve the problem. Here's what I ended up with:
//DESCRIPTION: Set all text frames in selected group to Ignore Text Wrap

if ((app.documents.length != 0) && (app.selection.length != 0)) {
 var myFrames = app.selection[0].textFrames;
 var myLim = myFrames.length;
 for (j = 0; myLim > j; j++) {
  myFrames[j].textFramePreferences.ignoreWrap = true;
 }
} else {
 errorExit();
}

// +++++++ Functions Start Here +++++++++++++++++++++++

function errorExit(message) {
 if (arguments.length > 0) {
  if (app.version != 3) { beep() } // CS2 includes beep() function.
  alert(message);
 }
 exit(); // CS exits with a beep; CS2 exits silently.
}
If you've been following along, you'll recognize the framework here, so the guts of the script are just these lines:
 var myFrames = app.selection[0].textFrames;
 var myLim = myFrames.length;
 for (j = 0; myLim > j; j++) {
  myFrames[j].textFramePreferences.ignoreWrap = true;
 }
Like I said, a special purpose script. Consequently, it does next to no error checking, and simply assumes that the selection is a group (having established thanks to my framework that there is a selection). What's more, it makes the assumption that all the text frames are at the first level within the group (which I happened to know was true).
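If the frames were not all at the first level, the same loop could be made to descend into nested groups. Here is a minimal sketch of that idea, assuming each group exposes textFrames and groups collections with a length and numeric indexing; the function name setIgnoreWrap is hypothetical. It still ignores frames pasted inside other page items.

```javascript
// Hypothetical helper: turn on Ignore Text Wrap for every text frame in
// a group, including frames inside nested groups.
// Returns the number of frames it changed.
function setIgnoreWrap(myGroup) {
  var myCount = 0;
  var myFrames = myGroup.textFrames;
  for (var j = 0; myFrames.length > j; j++) {
    myFrames[j].textFramePreferences.ignoreWrap = true;
    myCount++;
  }
  // Recurse into any groups nested inside this one.
  var mySubGroups = myGroup.groups;
  for (var k = 0; mySubGroups.length > k; k++) {
    myCount += setIgnoreWrap(mySubGroups[k]);
  }
  return myCount;
}
```

With this in place, the body of the main script would reduce to a single call such as setIgnoreWrap(app.selection[0]), and the returned count gives a quick sanity check on how many frames were touched.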

I confess that I was disappointed that I couldn't use the everyItem() construction here. Indeed, my first attempt at the script had just one line:
app.selection[0].textFrames.everyItem().textFramePreferences.ignoreWrap = true;
but this doesn't work, and I'm not entirely sure why not. Fodder for more research!

By the way, if you're an experienced JavaScripter, you might think my for statements a tad odd: why do I use "j" and not "i" like everybody else, and what's with the way I write the loop test?

Well, it happens that in the font I use for my scripting at the size I use it (Verdana at 10 points), the "i" is a tad hard to distinguish from the "l" so I just go with j, then k, then m, then n (if I ever get that deep).

And, as for the test, tradition would have me write j < myLim but if I do that, when it comes time to post the script on a forum (or even here), I'd have to edit all those < characters and replace them with &lt; which, bluntly, is more trouble than it's worth. So, I write my comparisons using ">" which doesn't seem to upset browsers (although some purists have argued rather heatedly with me about this). To me, myLim > j has become the natural construction.
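For anyone who would rather keep the traditional j < myLim form, the substitution described above can be automated before posting. A minimal sketch, with a made-up function name, that handles only the two characters that matter here:

```javascript
// Hypothetical helper: make script source safe to paste into a web page.
// Ampersands are escaped first so that the "&" in "&lt;" is not
// escaped a second time.
function escapeForHtml(mySource) {
  return mySource.replace(/&/g, "&amp;").replace(/</g, "&lt;");
}
```

Running the script text through a function like this is a separate step at posting time, which is exactly the extra step that writing myLim > j avoids.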

Thursday, September 01, 2005

 

More on This Morning's Script

One of the first things I realized about the script I presented this morning is that it isn't really necessary: if I just used the menu item (via the shortcut), all I'd have to do is press the Command key and I'd be able to drag the text frame where I needed it.

But there's nothing like having the selection change to the frame to remind me of this. So I think I'm still going to use it even though I've just discovered a bug: the selection could be overset. So, my four lines of code are going to have to expand to address this issue. Actually, it still takes only four lines, but I go about selecting the text frame differently:
 if (app.selection[0].isPureText()) {
  app.select(app.selection[0].parentStory.textFrames[-1]);
 }
 app.selection[0].fit(FitOptions.frameToContent);
This approach has the advantage of also working in CS. However, there are some possible ramifications if you were to try to use this script to fit frame to content for a frame in the middle of a long story; you'd suddenly find yourself on a completely different page and the fit command wouldn't work.

For safety, I should only use this approach if I have first detected that the insertion point is overset. And, come to think of it, why doesn't fit frame to content work for a threaded frame? Let's fix that too.

Sounds like a script for another day. There are ramifications I need to think through, for example, how to determine if some text is overset?
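One possible starting point for that research, offered as a sketch and not a tested answer: it assumes that in CS2 a text object lying entirely in overset text reports an empty parentTextFrames list. Both function names are hypothetical. Because it relies on parentTextFrames, it would be CS2-only.

```javascript
// Assumption: overset text has no parent text frames, so the
// parentTextFrames list comes back empty.
function isOverset(myText) {
  return myText.parentTextFrames.length == 0;
}

// Pick the frame to select: the last frame of the story only when the
// text is overset; otherwise the frame that actually holds the text.
function frameForText(myText) {
  if (isOverset(myText)) {
    var myFrames = myText.parentStory.textFrames;
    return myFrames[myFrames.length - 1];
  }
  return myText.parentTextFrames[0];
}
```

Choosing the frame this way would avoid the jump to a different page when the selection sits in the middle of a long story, since the last frame is used only when the text is overset.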

 

Script of the Day -- Fit Frame to Content

It comes to me in a blinding flash that I can enhance the behavior of Fit Frame to Content when the selection is an insertion point or text. I believe that without exception, when I issue that command in that state, I want to end up with the pointer tool active and the frame selected. So, let's give it a go and see:
//DESCRIPTION: Alt form of Fit Frame to Content

Object.prototype.isPureText = function() {
 switch(this.constructor.name){
  case "InsertionPoint":
  case "Character":
  case "Word":
  case "TextStyleRange":
  case "Line":
  case "Paragraph":
  case "TextColumn":
  case "Text":
   return true;
  default :
   return false;
 }
}

// +++++++ Main Script Starts Here ++++++++++++++++++++++++++

if ((app.documents.length != 0) && (app.selection.length != 0)) {
 if (app.selection[0].isPureText()) {
  app.select(app.selection[0].parentTextFrames[0]);
 }
 app.selection[0].fit(FitOptions.frameToContent);
} else {
 errorExit();
}

// +++++++ Functions Start Here +++++++++++++++++++++++

function errorExit(message) {
 if (arguments.length > 0) {
  if (app.version != 3) { beep() } // CS2 includes beep() function.
  alert(message);
 }
 exit(); // CS exits with a beep; CS2 exits silently.
}
Of the 38 lines in this script, all but four came straight from my snippet library. The critical four lines are:
 if (app.selection[0].isPureText()) {
  app.select(app.selection[0].parentTextFrames[0]);
 }
 app.selection[0].fit(FitOptions.frameToContent);
This uses the isPureText() method to see if the selection is text (by "pure" text I mean "not a text frame" -- I have another method named isText() that I use if I want to allow text frames to be included in whatever the script is about to do).
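The isText() method itself isn't shown here, so the following is only a guess at how such a pair of tests might relate. It is written as stand-alone functions, not as additions to Object.prototype, and it assumes the only extra case isText() needs to accept is an object whose constructor name is "TextFrame".

```javascript
// Stand-alone version of the test used in the script above.
function isPureText(myObj) {
  switch (myObj.constructor.name) {
    case "InsertionPoint":
    case "Character":
    case "Word":
    case "TextStyleRange":
    case "Line":
    case "Paragraph":
    case "TextColumn":
    case "Text":
      return true;
    default:
      return false;
  }
}

// Hypothetical reconstruction: text frames count as text too.
function isText(myObj) {
  return myObj.constructor.name == "TextFrame" || isPureText(myObj);
}
```

Stand-alone functions are called as isText(app.selection[0]) where the method form reads app.selection[0].isText(); the trade-off is a slightly clumsier call in exchange for leaving Object.prototype untouched.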

In this case, if the selection is text, then the script simply selects its parent text frame and applies the fit command to it. This leaves the frame selected when it completes, which is exactly what I wanted. Note that the syntax I've used is CS2-only; for CS, I'd have used simply .parentTextFrame, although that would not always work if the text spanned more than one text frame. You could argue that my CS2 code here also doesn't work if the text spans more than one frame, but that just doesn't happen in practice when I use this feature.

On the other hand, if the selection is not text, then the command behaves just as it always did.

The next step is to save the script somewhere sensible -- I put it in my Text scripts subfolder and named it FitFrameToContent.jsx -- and then assign it the Command-Option-C shortcut, which was previously assigned to the menu item Fit Frame to Content on the Fitting submenu of the Object menu. And away we go.

Serendipity

The script doesn't quite behave as I expected. It leaves the text tool active. But this means that I just hold down the Command key to move the text frame where I want it (nine times out of ten, when I need this feature for text frames, I'm constructing a caption for an image).
