Alfresco System Messages

In one of our custom Alfresco implementations we are making releases quite often due to somewhat heavy development. The customer used the site notice dashlet to inform users of upcoming downtimes but since there is a lot of sites with a lot of different users this was a quity cumbersome job to do before each release.

To aid the administrators we developed the alfresco-system-messages addon. Its implemented as a datalist using regular share forms when setting up an upcoming message. However, since this is something used system wide we decided to implement it as a Share Admin Console component, placing a datalistcontainer in the regular data dictionary of the Alfresco installation. Messages is time based and have different colours depending on priority.

all-pages-sm

admin-console-sm

This addon can be found here https://github.com/Redpill-Linpro/alfresco-systemmessages

Posted in Alfresco, datalists, share | 2 Comments

Tidying the alfresco workflow database

During an upgrade from Alfresco 4.2.5.1 to 4.2.6 for a client of ours we identified a problematic patch script which does some refactoring of data in the Activiti tables in the database. The customer system has been running for many years and the Activiti historic tables have grown large due to the fact that Alfresco never cleans these automatically. All workflow and task data is stored indefinitely in the database in the act_hi_*.

For this particular system the act_hi_detail table contained 2.1 million records (not that many for a database), however, the nasty SQL used to refactor data in the patch does not work well for a system of this size with that many workflows. The total number of Activiti processes (active and completed) were about 21000 and the patch script ran in our QA environment for many hours. Too many hours for us to have scheduled downtime, so we decided to cancel the upgrade script and find an alternate solution.

After some investigation and some trial and errors we found that when deleting a workflow using the workflow service it clears up the Activiti history tables. This can also be done from the UI per workflow after a workflow has completed. Then you have the option to delete the workflow from the workflow details page. This is nothing you do manually for 21k workflows so here comes the JavaScript console to the rescue!

var ctx = Packages.org.springframework.web.context.ContextLoader.getCurrentWebApplicationContext();
var log = Packages.org.apache.log4j.Logger.getLogger("RL_CANCEL_DELETED_WORKFLOWS");
var workflowService = ctx.getBean('WorkflowService');
logger.log("Starting cancel workflow script");
log.error("Starting cancel workflow script");

var completedWorkflows = workflowService.getCompletedWorkflows();
var limit = 5000;
logger.log("Limit is: "+limit);
log.error("Limit is: "+limit);
logger.log("Number of completed workflows: "+completedWorkflows.size());
log.error("Number of completed workflows: "+completedWorkflows.size());
if (completedWorkflows) {
	for (var i=0;i<completedWorkflows.size();i++) {
		var wf = completedWorkflows.get(i);
		if (wf.isActive()) {
			logger.log("Workflow is still active: "+wf.getId());
			log.error("Workflow is still active: "+wf.getId());
		} else {
			logger.log("Deleting workflow: "+wf.getId());
			log.error("Deleting workflow: "+wf.getId());
			workflowService.deleteWorkflow(wf.getId());
		}
		if (i>=limit) {
			break;
		}
	}
}
logger.log("Finished cancel workflow script");
log.error("Finished cancel workflow script");

This script will delete all completed workflows, both old jBPM workflows and Activiti workflows. The script will most likely time out when you run it so we added some log4j logging to it as well to get a log trail in our alfresco.log.

Since we are doing some workflow maintenance here we might as well delete all Active old jBPM workflows as well (in this case we knew for a fact that they will never be completed).

var ctx = Packages.org.springframework.web.context.ContextLoader.getCurrentWebApplicationContext();
var log = Packages.org.apache.log4j.Logger.getLogger("RL_DELETE_JBPM_WORKFLOWS");
var workflowService = ctx.getBean('WorkflowService');

var activeWorkflows = workflowService.getActiveWorkflows();
var limit = 2000;
logger.log("Limit is: "+limit);
log.error("Limit is: "+limit);
logger.log("Number of active workflows: "+activeWorkflows.size());
log.error("Number of active workflows: "+activeWorkflows.size());
if (activeWorkflows) {
	for (var i=0;i<activeWorkflows.size();i++) {
		var wf = activeWorkflows.get(i);
		if (!wf.isActive()) {
			logger.log("Workflow is not active: "+wf.getId());
			log.error("Workflow is not active: "+wf.getId());
		} else if (wf.getId().indexOf("activiti")===0) {
			logger.log("Activiti workflow: "+wf.getId());
			log.error("Activiti workflow: "+wf.getId());
		} else if (wf.getId().indexOf("jbpm")===0) {
			logger.log("Canceling jBPM workflow: "+wf.getId());
			log.error("Canceling jBPM workflow: "+wf.getId());
			workflowService.cancelWorkflow(wf.getId());
		} else {
			logger.log("Unknown workflow type"+wf.getId());
			log.error("Unknown workflow type"+wf.getId());
		}
		if (i>=limit) {
			break;
		}
	}
}

As a result of this maintenance job, we have about 1700 active activiti workflows and an act_hi_detail table with about 210000 rows (about 10% of the original count) and the patch went through in seconds.

Posted in Uncategorized | 1 Comment

Confused JVM? Kill it!

We have a customer where we’ve started to get memory problems in Alfresco recently. Those kind of problems can be very hard to pinpoint but for this particular client we’re almost sure what’s causing the memory problems. Unfortunately for us, this knowledge doesn’t prevent the memory problem. The bad thing with the JVM (in this case) is that even though a memory problem has occured, the JVM is left in a running state, although it’s not in a good running state… This particular Alfresco solution is clustered, and the memory problems ejects the server from Alfresco’s Hazelcast cluster but the LoadBalancer cluster still thinks the server is in the cluster which leads to a lot of problems down the road :(

Our customer have a very good organization around Alfresco, and if the JVM in which Alfresco lives should die when a memory problem occurs, it can be restarted in a breeze. In order to achieve this there is a JVM parameter (-XX:OnOutOfMemoryError) which can be used to execute a script when such an error occurs.

Below is how we solved this for our customer, a step-by-step instruction how to achive this. The server OS is Ubuntu 12.04.

  1. Install the package mailutils if not already installed.

    sudo apt-get install mailutils
  2. Create a shell script somewhere in your installation path.

    nano -w /opt/alfresco/current_version/bin/mail-and-kill.sh
  3. Paste this content into the script.

    #!/bin/bash
    
    # First argument  : process id
    # Second argument : server port
    # Third argument  : module (repo, solr or share)
    
    PROCESSID="$1"
    PORT="$2"
    MODULE="$3"
    FROM="noreply@example.com"
    TO="it-operations@example.com"
    SUBJECT="Tomcat shut down on $HOSTNAME:$PORT ($MODULE)"
    FILENAME="/tmp/mail-and-kill.txt"
    
    rm -f $FILENAME
    
    echo "Server  : $HOSTNAME" >> $FILENAME
    echo "Port    : $PORT" >> $FILENAME
    echo "Module  : $MODULE" >> $FILENAME
    echo "Message : Server got an java.lang.OutOfMemoryError and java process is killed" >> $FILENAME
    
    mail -a "From: $FROM" -s "$SUBJECT" $TO < $FILENAME
    
    kill -9 $PROCESSID
        

    The script takes three parameters. Process id (jvm process), port of Tomcat (HTTP port), and the module which caused the problem (repo, solr or share).

    In order for this to work, a local mail server has to be installed. For our client we’ve installed postfix which acts as a mail relay server.

  4. Add the following to the Tomcat startup parameters (for example setenv.sh).

        JAVA_OPTS="$JAVA_OPTS -XX:OnOutOfMemoryError='/opt/alfresco/current_version/bin/mail-and-kill.sh %p 8080 repo'"
        
  5. Restart Alfresco and force some nasty code to kill it :) and watch how you get a mail and the JVM is killed.
Posted in Uncategorized | 36 Comments

Posting custom events to the activity feed

A not so uncommon requirement from our customers is the need for custom activity messages to be posted when a user interacts with the system in some way. A recent example is to post a notification to the feed whenever a task in one of our custom workflows is completed. Alfresco really lacks documentation on how to implement this.

The use case is that we have a bunch of datalists, all with custom datalist types. For all those lists the user can initiate a custom workflow for every item in every list. We are talking about some 10 different types, with about 10 different custom workflows with a variety in complexity (that is a lot of transitions).

We wanted to come up with a general notification that could be used for every transition in every workflow, to avoid the need for like 50 different messages.

We came up with this activity message, which is placed in the resource bundles on both Repo (email feed) and Share (activity dashlet) side:
org.redpill.task.transition={1} completed workflow task “{2}” for {0} with the following outcome “{3}”

this will transform into something like the following when the variables are substituted by Alfresco during runtime:

demouser completed workflow task “review document” for Contact 1 with the following outcome “Rejected”

In the the 4.1.5-documentation I found the following which explains how the variables are substituted:

The email description is generated using the following format, for example, for a newly created blog post:

org.alfresco.blog.post-created={1} created blog post {0}

Where
0 = Item title / page link
1 = User profile link
2 = custom0
3 = custom1
4 = Site link
5 = second user profile link

To post to the activity feed we will use the following ActivityService.postActivity method:

/**
* Post a custom activity type
*
* @param activityType – required
* @param siteId – optional, if null will be stored as empty string
* @param appTool – optional, if null will be stored as empty string
* @param jsonActivityData – required
*/
public void postActivity(String activityType, String siteId, String appTool, String jsonActivityData);

Lets explain the parameters:
activityType – the message key “org.redpill.task.transition” which will map to the actual message in the feed.
siteId – if this message is related to a site the shortName of the site is provided, null if its a global activity. Since our workflows are executed in the context of a site we will provide the shortName of the site our datalistitem is present in.
appTool – a css-class used for visual formatting of the provided activity type in the gui. Alfresco have a set of classes for example “comment” and “calendar”. When supplied, a css class is applied which will render for example a custom icon in the activity message.

Tho add the css classes for our new activityType you will have to override the “activity-list.get.config.xml” and point out a css-class. In this example we will be reuse the Alfresco provided “comment”-class. You could of course implement your own css-class with a corresponding css-file as well.

jsonActivityData – now it starts to be a bit tricky, had wished this was better documented. To my current understanding, in the json you will have to provide “0 = Item title / page link” and “2 = custom0, 3 = custom1″ from the documentation above. If you provide the keys “title” and “page” with corresponding values in the json this will be rendered as a link substituting {0}. Then you have the possibility to provide up to two custom fields which we will do, the resulting json in our case will be:

{“lastName”:”Contractor”,
“title”:”4″,”page”:”lyse-datalist-details?workflowId=activiti$6677″,”firstName”:”Nils”,”taskName”:”Company Review”,”taskOutcome”:”approve”}

firstName and lastName is something the legacy-alfresco code have injected to the resulting json by looking up the fully authenticated user performing the execution. By default that will be rendered as a user profile linke replacing {1}.

Our call to the activityService.postActivity will look something like the following having the datalistItem nodeRef, and the task stuff as parameters to our method:

// Get the site shortName
SiteInfo site = siteService.getSite(nodeRef);
int id = (Integer) nodeService.getProperty(nodeRef, DatalistIDService.PROP_DATALISTITEM_ID);

QName type = nodeService.getType(nodeRef);
// construct a type label by QName type short name, by asking our static Map which holds the labels.
String name = typeNameMapper.map.get(type.getPrefixString());
JSONObject activity = new JSONObject();
if (id != -1) {

if (name != null){
name = name + ” ” + id;
}else{
name = “”+id;
}
// Is there an ongoing workflow, in that case link to our custom workflow page?
String workflowId = (String) nodeService.getProperty(nodeRef, CustomDatalistModel.PROP_WORKFLOW_ID);
if (StringUtils.isNotEmpty(workflowId)){
activity.put(“page”, “custom-datalist-details?workflowId=” + workflowId);
}else{
activity.put(“page”, “custom-datalist-details?nodeRef=” + nodeRef.toString());
}

activity.put(“title”, name);
activity.put(“taskName”, taskName);
activity.put(“taskOutcome”, taskOutcome);
}
activityService.postActivity(“org.redpill.task.transition”, site.getShortName(), “comment”, activity.toString());

So, how do Alfresco know what to do with “taskName” and “taskOutcome”, our custom parameters? Well this took a bit of digging into the code and to replace the two custom parameters with the values stored in the json we will need to override a method in the Share activity dashlet webscript. In activity-list.get.js there is function ‘specialize’ which formats the i18-message property. Add a case-statement for every custom message key which will not use the default rendering.

case “org.redpill.task.transition”:
item.custom0 = summary.taskName;
item.custom1 = msg.get(“workflowtask.outcome.” + summary.taskOutcome);
break;

Notice the workflowTask.outcome prepending the outcome message key. The explanation to this can be found in this blog post.

The activity feed is emailed to the users as well, thats why we need to duplicate the resource bundles to the Repo side as well. We will also have to modify the activities-email.ftl template as well, in short adding the same code logic as used in the activity-dashlet webscript. This could be done by modifying it directly in the dictionary or (like we do) by writing a patch that replaces the file in the dictionary when deploying our customization code. For inspiration you can have a look at the ActivitiesTemplatesUpdatePatch.java in the Alfresco source code tree on how to replace a ftl template.

Posted in Alfresco, share, Uncategorized, workflow | 14 Comments

Integration testing emails

Recently I received a task where I had to modify a function I’ve wrote a long time ago that send out emails to users based on some criteria. The problem is that when I wrote that function I:

1. Didn’t have that good understanding and knowledge of Alfresco
2. Alfresco’s email sending capabilities weren’t as good as today
3. It was for a 3.x version of Alfresco

I then decided to first write a test for the function. The thing is that the essential thing in the function is the actual subject, content and other properties of the mail. I also want this to work regardless of where the test is executed (dev computer, CI environment, etc). I googled some and found some rather nice frameworks for setting up a simple SMTP server in a junit test. I also found out that one of the more simple ones to use is actually included in Alfresco, GreenMail.

I promptly started to write my integration test. Define a private variable for the test first.

private GreenMail _mailServer;

In the “@Before” part (or similar) create and start the SMTP instance.

_mailServer = new GreenMail(ServerSetupTest.SMTP);
_mailServer.start();

In the “@After” part (or similar) stop the SMTP instance.

_mailServer.stop();

You also have to modify the alfresco-global.properties file for your test suite so that Alfresco sends emails to the correct port. As default GreenMail ServerSetupTest.SMTP picks 25 + an offset number (3000) as it’s port, so set the value to:

mail.port=3025

Now, in your test just execute the code that sends the emails. Now comes the nice stuff. GreenMail has a function to get all received messages. Getting the messages and doing the proper assertions is now a walk in the park :)

MimeMessage[] messages = _mailServer.getReceivedMessages();
assertEquals(1, messages.length);

assertThat(message.getSubject(), containsString("Urget message!"));
assertThat(message.getContent().toString(), containsString("Kind regards,"));

No more excuses not to test your email sending capabilities…

Posted in Uncategorized | Leave a comment

Search is no longer context aware in Alfresco 5

During testing an upgrade from Alfresco 4 to Alfresco 5 we noticed that searches are no longer context aware in Alfresco 5. Before Alfresco 5, if you made a search from within a site, then the default action was that results were also scoped to the site. This no longer happens in Alfresco 5. There is a Jira for this: https://issues.alfresco.com/jira/browse/MNT-13350, however the solution proposed in the comments wouldn’t work for me. The correct code to get the search to be context aware again would be to put the following code snippet in your extended share-header.get.js.

/** 
 Set default search scope to site:

https://issues.alfresco.com/jira/browse/MNT-13350

**/
function setDefaultSearchScopeToSite() {
  var searchMenuBase = widgetUtils.findObject(model.jsonModel.widgets, "id", "HEADER_SEARCH");
  //searchMenuBase.config.site = page.url.templateArgs.site; //Does not work
  searchMenuBase.config.defaultSearchScope = page.url.templateArgs.site;
}
setDefaultSearchScopeToSite();

Perhaps something changed in the Aikau framework after this issue was tested which introduced this regression error.

My reflection is, why isn’t the default behaviour of the search that it should be context aware? Isn’t that what people want? Or at least make this configurable in a simple way instead of requiring extending a web script controller.

There are a few other issues relating to this such as scoping the live search and advanced search: https://issues.alfresco.com/jira/browse/MNT-14006, https://issues.alfresco.com/jira/browse/MNT-11543

Posted in Alfresco, share, Solr, Uncategorized | 4 Comments

How to use form filters in workflow task forms

Had some problems the other day to get a form filter to execute. Found out that workflow task forms uses an own filterRegistry. So the spring bean definition have to be like this for example:

  <bean id="myTaskFormFilter" class="se.redpill.alfresco.repo.form.filter.MyTaskFormFilter" parent="baseFormFilter">
    <property name="filterRegistry" ref="taskFilterRegistry" />
  </bean>
Posted in Alfresco, workflow | 1 Comment

How to debug solr exceptions/warnings for Alfresco (Part 3)

Some good to know stuff having followed debugging which I thought I’d share.

First of all, to find out the node ref of a node id do the following SQL:

  SELECT CONCAT(store.protocol,'://',store.identifier,'/',node.uuid), node.id, node.audit_creator, node.audit_created, node.audit_modifier, node.audit_modified,
       node.audit_accessed
  FROM alf_node node
  LEFT JOIN alf_store as store ON store.id=store_id
  WHERE node.id in ('4777883', '4777625');

Next just use the node browser or Alfresco Share to find your node and its info.

If you are having problem with a node and want to exclude it from the index, have a look at http://docs.alfresco.com/4.2/concepts/admin-indexes.html. There it is described how to exclude a node from the Solr index. Or just disable content indexing, or metadata indexing.

Just apply the aspect and set the properties accordingly on a node and it will be indexed according to the rules described on the page. In my case I had to deactivate content indexing for some failing nodes due to corrupt content (but correct metadata).

function disableContentIndexingFor(nodeRef) {
  var node = search.findNode(nodeRef);

  node.properties["cm:isContentIndexed"] = false;  //Disable content index on the node
  node.properties["cm:isIndexed"] = true; //Allow metadata index on the node

  node.save();
}

disableContentIndexingFor("workspace://SpacesStore/cafebabe-cafe-babe-cafe-babecafebabe");

How to debug solr exceptions/warnings for Alfresco (Part 1)
How to debug solr exceptions/warnings for Alfresco (Part 2)

Posted in Uncategorized | Leave a comment

How to debug solr exceptions/warnings for Alfresco (Part 2)

In this post I dig a bit deeper in how you can debug solr exceptions/warnings. In part 1 of this blog series we looked briefly at the problem where we got a timeout and how we can fix this in the general case.

This post will show how to get more information about what could be the problem causing these errors.

We will look at an example where we get the following error in the solr log, however the way of debugging would be the same or similar for other kinds of errors:

2015-04-28 15:02:50,413 WARN  [org.alfresco.solr.tracker.CoreTracker] Node index failed and skipped for 8727989 in Tx 9223372036854775807
org.json.JSONException: Unterminated string at character 1866
        at org.json.JSONTokener.syntaxError(JSONTokener.java:413)
        at org.json.JSONTokener.nextString(JSONTokener.java:244)
        at org.json.JSONTokener.nextValue(JSONTokener.java:344)
        at org.json.JSONObject.(JSONObject.java:206)
        at org.json.JSONTokener.nextValue(JSONTokener.java:347)
        at org.json.JSONArray.(JSONArray.java:125)
        at org.json.JSONTokener.nextValue(JSONTokener.java:351)
        at org.json.JSONObject.(JSONObject.java:206)
        at org.alfresco.solr.client.SOLRAPIClient.getNodesMetaData(SOLRAPIClient.java:774)
        at org.alfresco.solr.tracker.CoreTracker.indexNode(CoreTracker.java:2376)
        at org.alfresco.solr.tracker.CoreTracker.reindexNodes(CoreTracker.java:1057)
        at org.alfresco.solr.tracker.CoreTracker.updateIndex(CoreTracker.java:566)
        at org.alfresco.solr.tracker.CoreTrackerJob.execute(CoreTrackerJob.java:45)
        at org.quartz.core.JobRunShell.run(JobRunShell.java:216)
        at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:563)

In this warning in the log we get no real information about what is causing the error, just that it concerns a node with id 6899573 in the transaction 9223372036854775807 and that it has something to do with parsing a json.

So the natural next step would be to fetch information about the node which we in this case will do by manually fetching the JSON the same way as Solr does.

If we look at SOLRAPIClient.java:getNodesMetaData it is obvious that Solr makes a request to api/solr/metadata to fetch metadata about a node. This will return a Json string with the metadata for an object.

We want to do the same for this particular failing node which means that we need to construct a Json request to http://localhost:8080/alfresco/service/api/solr/metadata.

By default the solr endpoints are protected by SSL. To make things easier to debug you can disable SSL on Solr by following the guide in the Alfresco Documentation: http://docs.alfresco.com/4.2/tasks/running-without-ssl.html.

I used the Chrome plugin Postman for performing this action, but curl or any other tool capable of making a REST Post request will do.

The request looks as follows:

POST http://localhost:8080/alfresco/service/api/solr/metadata
Content-Type: application/json
Body:
{
fromNodeId: 8727989,
toNodeId: 8727989,
maxResults: 1
}

And the response is:

{
    "nodes": [
        {
            "id": 8727989,
            "tenantDomain": "",
            "nodeRef": "workspace://SpacesStore/2777063a-1beb-41c6-bf96-9882ec740de6",
            "type": "custom:document",
            "aclId": 3570765,
            "txnId": 10801744,
            "properties": {
                "cm:modified": "2014-05-27T13:04:19.009Z",
                "cm:content": {
                    "contentId": "5285762",
                    "encoding": "UTF-8",
                    "locale": "sv_",
                    "mimetype": "application/pdf",
                    "size": "60978"
                },
                "sys:node-uuid": "2777063a-1beb-41c6-bf96-9882ec740de6",
                "cm:name": "Test \u0015 document.pdf",
                "sys:locale": "sv_",
                "sys:store-protocol": "workspace",
                "cm:autoVersion": "true",
                "sys:node-dbid": "8727989",
                "vgr:dc.source.origin": "Alfresco",
                "cm:initialVersion": "true",
                "cm:creator": "admin",
                "cm:created": "2014-05-27T13:04:19.009Z",
                "cm:title": [
                    {
                        "locale": "sv_",
                        "value": "Test \u0015 document.pdf"
                    }
                ],
                "cm:autoVersionOnUpdateProps": "true",
                "cm:modifier": "admin",
                "sys:store-identifier": "SpacesStore"
            },
            "aspects": [
                "cm:auditable",
                "sys:incomplete",
                "sys:referenceable",
                "cm:titled",
                "sys:localized",
                "cm:versionable"
            ],
            "paths": [
                {
                    "path": "/{http://www.alfresco.org/model/application/1.0}company_home/{http://www.alfresco.org/model/site/1.0}sites/{http://www.alfresco.org/model/content/1.0}test/{http://www.alfresco.org/model/content/1.0}documentLibrary/{http://www.alfresco.org/model/content/1.0}Test_x0020__x0015__x0020_document.pdf"
                }
            ],
            "ancestors": [
                "workspace://SpacesStore/627af57f-566f-4389-a19e-1f3aa71124dc",
                "workspace://SpacesStore/fae70774-af0e-4ca5-890e-cc2e19414d07",
                "workspace://SpacesStore/9f9e873e-4590-40ac-b8a5-54300938354f",
                "workspace://SpacesStore/dfcd625a-af5e-4eee-90c8-1c2333a8fac5",
                "workspace://SpacesStore/98f20817-27c1-4adb-acdb-bc825059d582",
                "workspace://SpacesStore/741b92bb-80a5-4fd5-b116-741523513776",
                "workspace://SpacesStore/403b9392-43ca-496a-94f9-581482f49fdb"
            ],
            "parentAssocs": [
                "workspace:\/\/SpacesStore\/627af57f-566f-4389-a19e-1f3aa71124dc|workspace:\/\/SpacesStore\/2777063a-1beb-41c6-bf96-9882ec740de6|{http:\/\/www.alfresco.org\/model\/content\/1.0}contains|{http:\/\/www.alfresco.org\/model\/content\/1.0}Test \x15 document.pdf|true|-1"
            ],
            "parentAssocsCrc": 1921797508,
            "owner": "admin"
        }
    ]
}

I used a Json validator (http://jsonformatter.curiousconcept.com/ and copy-pasted the json what the validation result as). It complained about an invalid character in the following part of the Json and what we can spot here is that in the document name an utf-8 character has not been correctly escaped.

"parentAssocs":[  
            "workspace:\/\/SpacesStore\/627af57f-566f-4389-a19e-1f3aa71124dc|workspace:\/\/SpacesStore\/2777063a-1beb-41c6-bf96-9882ec740de6|{http:\/\/www.alfresco.org\/model\/content\/1.0}contains|{http:\/\/www.alfresco.org\/model\/content\/1.0}Test \x15 document.pdf|true|-1"
         ],

Comparing the file name from the metadata we see that it should be “Test \u0015 document.pdf”, whereas in the parentAssocs part it says “Test \x15 document.pdf” which is not a valid escape string in Json. Further investingation showed that this is a bug in solr.lib.ftl which comes from that Freemarker does not translate unicode characters below 0×20. Quick fix, remove the special character from the filename (which probably should not be there anyway) and reindex the node. The issue has been reported to Alfresco Support and if it becomes a Jira issue I’ll link it in this article.

I made a patch to solr.lib.ftl which fixes this issue and when running the same REST call again and then we get a valid Json back from the metadata endpoint:

{
    "nodes": [
        {
            "id": 8727989,
            "tenantDomain": "",
            "nodeRef": "workspace://SpacesStore/2777063a-1beb-41c6-bf96-9882ec740de6",
            "type": "custom:document",
            "aclId": 3570765,
            "txnId": 10801744,
            "properties": {
                "cm:modified": "2014-05-27T13:04:19.009Z",
                "cm:content": {
                    "contentId": "5285762",
                    "encoding": "UTF-8",
                    "locale": "sv_",
                    "mimetype": "application/pdf",
                    "size": "60978"
                },
                "sys:node-uuid": "2777063a-1beb-41c6-bf96-9882ec740de6",
                "cm:name": "Test \u0015 document.pdf",
                "sys:locale": "sv_",
                "sys:store-protocol": "workspace",
                "cm:autoVersion": "true",
                "sys:node-dbid": "8727989",
                "vgr:dc.source.origin": "Alfresco",
                "cm:initialVersion": "true",
                "cm:creator": "admin",
                "cm:created": "2014-05-27T13:04:19.009Z",
                "cm:title": [
                    {
                        "locale": "sv_",
                        "value": "Test \u0015 document.pdf"
                    }
                ],
                "cm:autoVersionOnUpdateProps": "true",
                "cm:modifier": "admin",
                "sys:store-identifier": "SpacesStore"
            },
            "aspects": [
                "cm:auditable",
                "sys:incomplete",
                "sys:referenceable",
                "cm:titled",
                "sys:localized",
                "cm:versionable"
            ],
            "paths": [
                {
                    "path": "/{http://www.alfresco.org/model/application/1.0}company_home/{http://www.alfresco.org/model/site/1.0}sites/{http://www.alfresco.org/model/content/1.0}test{http://www.alfresco.org/model/content/1.0}documentLibrary/{http://www.alfresco.org/model/content/1.0}Test_x0020__x0015__x0020_document.pdf"
                }
            ],
            "ancestors": [
                "workspace://SpacesStore/627af57f-566f-4389-a19e-1f3aa71124dc",
                "workspace://SpacesStore/fae70774-af0e-4ca5-890e-cc2e19414d07",
                "workspace://SpacesStore/9f9e873e-4590-40ac-b8a5-54300938354f",
                "workspace://SpacesStore/dfcd625a-af5e-4eee-90c8-1c2333a8fac5",
                "workspace://SpacesStore/98f20817-27c1-4adb-acdb-bc825059d582",
                "workspace://SpacesStore/741b92bb-80a5-4fd5-b116-741523513776",
                "workspace://SpacesStore/403b9392-43ca-496a-94f9-581482f49fdb"
            ],
            "parentAssocs": [
                "workspace://SpacesStore/627af57f-566f-4389-a19e-1f3aa71124dc|workspace://SpacesStore/2777063a-1beb-41c6-bf96-9882ec740de6|{http://www.alfresco.org/model/content/1.0}contains|{http://www.alfresco.org/model/content/1.0}Test \u0015 document.pdf|true|-1"
            ],
            "parentAssocsCrc": 1921797508,
            "owner": "admin"
        }
    ]
}

The steps outlined here can be used on other kind of errors/warnings as well. Hopefully it will be of use to someone!

How to debug solr exceptions/warnings for Alfresco (Part 1)
How to debug solr exceptions/warnings for Alfresco (Part 3)

Posted in Alfresco, Solr | 1 Comment

How to debug solr exceptions/warnings for Alfresco (Part 1)

There are two kinds of solr error / warnings that I’ve seen a few times but never really investigated further. Now when in the process of migrating a big Alfresco installation from Lucene to Solr we had alot of errors and warnings that needed to be investigated and fixed.

Error/warning 1:

2015-04-28 16:11:35,532 WARN  [org.alfresco.solr.tracker.CoreTracker] Node index failed and skipped for 4777625 in Tx 9223372036854775807
java.net.SocketTimeoutException: Read timed out
       at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:152)
        at java.net.SocketInputStream.read(SocketInputStream.java:122)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
        at org.apache.commons.httpclient.HttpParser.readRawLine(HttpParser.java:78)
        at org.apache.commons.httpclient.HttpParser.readLine(HttpParser.java:106)
        at org.apache.commons.httpclient.HttpConnection.readLine(HttpConnection.java:1116)
        at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$HttpConnectionAdapter.readLine(MultiThreadedHttpConnectionManager.java:1413)
        at org.apache.commons.httpclient.HttpMethodBase.readStatusLine(HttpMethodBase.java:1973)
        at org.apache.commons.httpclient.HttpMethodBase.readResponse(HttpMethodBase.java:1735)
        at org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:1098)
        at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:398)
        at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
        at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
        at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
        at org.alfresco.httpclient.AbstractHttpClient.executeMethod(AbstractHttpClient.java:135)
        at org.alfresco.httpclient.AbstractHttpClient.sendRemoteRequest(AbstractHttpClient.java:111)
        at org.alfresco.httpclient.HttpClientFactory$DefaultHttpClient.sendRequest(HttpClientFactory.java:393)
        at org.alfresco.solr.client.SOLRAPIClient.getTextContent(SOLRAPIClient.java:992)
        at org.alfresco.solr.tracker.CoreTracker.addContentPropertyToDoc(CoreTracker.java:2980)
        at org.alfresco.solr.tracker.CoreTracker.indexNode(CoreTracker.java:2568)
        at org.alfresco.solr.tracker.CoreTracker.reindexNodes(CoreTracker.java:1057)
        at org.alfresco.solr.tracker.CoreTracker.updateIndex(CoreTracker.java:566)
        at org.alfresco.solr.tracker.CoreTrackerJob.execute(CoreTrackerJob.java:45)
        at org.quartz.core.JobRunShell.run(JobRunShell.java:216)
        at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:563)

Error/warning 2:

2015-04-28 15:02:50,413 WARN  [org.alfresco.solr.tracker.CoreTracker] Node index failed and skipped for 6899573 in Tx 9223372036854775807
org.json.JSONException: Unterminated string at character 1866
        at org.json.JSONTokener.syntaxError(JSONTokener.java:413)
        at org.json.JSONTokener.nextString(JSONTokener.java:244)
        at org.json.JSONTokener.nextValue(JSONTokener.java:344)
        at org.json.JSONObject.(JSONObject.java:206)
        at org.json.JSONTokener.nextValue(JSONTokener.java:347)
        at org.json.JSONArray.(JSONArray.java:125)
        at org.json.JSONTokener.nextValue(JSONTokener.java:351)
        at org.json.JSONObject.(JSONObject.java:206)
        at org.alfresco.solr.client.SOLRAPIClient.getNodesMetaData(SOLRAPIClient.java:774)
        at org.alfresco.solr.tracker.CoreTracker.indexNode(CoreTracker.java:2376)
        at org.alfresco.solr.tracker.CoreTracker.reindexNodes(CoreTracker.java:1057)
        at org.alfresco.solr.tracker.CoreTracker.updateIndex(CoreTracker.java:566)
        at org.alfresco.solr.tracker.CoreTrackerJob.execute(CoreTrackerJob.java:45)
        at org.quartz.core.JobRunShell.run(JobRunShell.java:216)
        at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:563)

This first part of the guide will cover the first error/warning.

This particular error comes from a timeout when Alfresco is trying to extract text content from the document. Possible causes could be a slow transformation or high load on the system. So the place to start would be to just increase the timeout in solr so that it waits around a bit longer for a response. This is done by modifying the SOLR_HOME/workspace-SpacesStore/conf/solrcore.properties and SOLR_HOME/archive-SpacesStore/conf/solrcore.properties.

The setting to look for is: alfresco.socketTimeout. Default value is 60 seconds (60000). Increase it to a level suitable for your installation such as alfresco.socketTimeout=240000 for 4 minutes timeout.

Some times increasing the timeout does not help and you see that Alfresco is working hard to serve the request. Then you might have to do with a corrupt document or a document which is too large to effectively export text content from.

Part 2 of this guide will contain information about the other error/warning. In it you will also get some tips on how to get more information about a failed index request.

How to debug solr exceptions/warnings for Alfresco (Part 2)
How to debug solr exceptions/warnings for Alfresco (Part 3)

Posted in Alfresco, Solr | 2 Comments