Checking confluence with pythons

Heaven and hell.

Being a new starter at my new job, eager to absorb as much information as I can in as short a time span as possible, my attention immediately gravitated towards the Confluence Wiki with its plethora of pages (1800+) as a starting point.

There is a lot of information there, but the organisation of it could be better, so in line with the take-the-ball-and-run-with-it culture, I did.

First concern: how to improve things without making them worse?
Broken links suck.
Broken links I created suck even more.

Sadly, and inexplicably, Confluence to this day does not include any tools to check for broken links, beyond the basics of “Orphaned pages” and “Links to new pages that haven’t been created yet”.

A quick search on the internet found a tool by an Atlassian employee that seemed promising: BustedStuffReport. Point it at a Wiki and it will scan all the pages and do some regex-magic to heuristically determine if all is in order.

Sadly, it only works on public Wikis, it does not follow any links to check them, it seems to target a somewhat older version of Confluence, and it uses Python 2.

Having most of a solution already there, I decided I could hack, er, improve it to make it useful enough. Just let me get the Python language reference out and see what happens!?

After a week of playing around after work and in between tasks, I have a mostly rewritten script that is converging on the target I want to hit. I’ll post on Sciurus with the full details once I get there.

In the meantime, the experience has taught me the following:

  • The XML-RPC API to Confluence is very rich and regular, and reaches into almost all the corners I need (Yay!)
  • This XML-RPC API was deprecated in favour of REST… while the REST API has not yet reached functional parity (D’oh!)
  • Confluence very helpfully does *some* classification of links through CSS classes… no idea why this isn’t visually represented by default (Huh?)
  • Python makes it incredibly easy to access the XML-RPC (Yay!)
  • Python still makes my skin crawl with its lack of type-safety, and no, I don’t want to write unit tests for a small tool like this (Boo!)
  • List comprehensions are awesome (just like LINQ is (double-Yay!))
  • Why do I need to end my conditional statement lines with a colon? I guess I can live with this, but for a language that strives for visual sparsity it seems like an odd requirement (*shrug*)
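
To make a few of those points concrete, here is a minimal sketch (not the actual script) of how little Python it takes to pull rendered pages over XML-RPC and pick out links by CSS class with a list comprehension. Everything specific in it is an assumption to check against your own instance: the /rpc/xmlrpc endpoint, the confluence2 namespace with its login, getPages, renderContent and logout calls, and the "createlink" class as the marker for links to pages that do not exist yet. The helper names (LinkCollector, collect_links, suspicious_links, fetch_rendered_pages) are made up for illustration.

```python
from html.parser import HTMLParser
import xmlrpc.client


class LinkCollector(HTMLParser):
    """Collect (href, css_classes) pairs from every <a> tag in rendered HTML."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if href is None:
            return  # anchors without a target are not links we can check
        css_classes = (attributes.get("class") or "").split()
        self.links.append((href, css_classes))


def collect_links(html):
    """Return every link in the HTML together with its CSS classes."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


def suspicious_links(html, marker_class="createlink"):
    """Return hrefs whose <a> tag carries marker_class.

    Assumption: "createlink" is the class Confluence puts on links to
    pages that have not been created; verify against your own markup.
    """
    return [href for href, classes in collect_links(html) if marker_class in classes]


def fetch_rendered_pages(base_url, username, password, space_key):
    """Yield (title, rendered_html) for each page in a space.

    Assumption: the remote API lives at base_url + "/rpc/xmlrpc" and
    exposes the confluence2 methods used below. Not exercised offline.
    """
    server = xmlrpc.client.ServerProxy(base_url + "/rpc/xmlrpc")
    api = server.confluence2
    token = api.login(username, password)
    try:
        for summary in api.getPages(token, space_key):
            # An empty content string is assumed to mean "render the stored page".
            html = api.renderContent(token, space_key, summary["id"], "")
            yield summary["title"], html
    finally:
        api.logout(token)
```

Given a wiki URL and credentials, looping over fetch_rendered_pages and feeding each page to suspicious_links would give a first rough report; actually following the links to see whether they resolve is a separate step this sketch leaves out.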

I think I can see why people love Python for scripts. But I’m still not convinced the productivity gained by the flexible typing system isn’t overshadowed by the extra test-cases you’d need to code in a non-trivial application. So, that leaves trivial for now, for me.

Day 237 – Just Like a Run

It was always going to be a full day today. Weeks with Melbourne travel in them are always a little more hectic than any others to begin with.

One of the last meetings of my day was with a counterpart in the business. I’ve extended the concept of employee one-on-ones out to the business as well. We are after all trying to meet business needs, and the best way to keep in-the-loop in that regard is to have regular conversations.

It’s always good to catch up.

When I got back to my desk there was a message indicating the Wiki system was having some issues. Quick click. No, no dice for me either.

Log in to the VM. Restart service. No dice.

Upgrade OS. Upgrade Wiki. Restart service. No dice.

This was really not the last thing I wanted to deal with for the day. One hour before my gym class starts. Log in, cancel, sulk.

Have you ever tried to read and comprehend a Tomcat/Java log file full of warnings?
No, neither had I. I had no idea what I was looking for.
I looked at it two times without much luck.

Third time was luckier though.
I noticed that the most verbose messages were detailing database connection issues. Surely not? Why now? What has changed?

To cut a long story short, it turned out that nobody had ever properly documented which database the system relied on. The database had been handed from one team to another. The other team was consolidating, and didn’t realise the database server was still being used for anything. Database server gets turned off.

A quick fix.

Alas, 15 minutes too late to get me to the gym in time.

Saturday will be my next exercise now.