Translating documents

It has been a while since I’ve written here. So… let’s relight the candle.

I have a Microsoft Word document. How can I translate it? Or just some words from it.

Well, you can look online for a freelancer or a company that provides translation services. But that costs money… and quite some time. Or you can ask your friend Moe to help. But… yeah…

If you are on a recent version of Office, you might have noticed a “Translate” button in the “Review” ribbon tab. This is a standard feature provided by Microsoft to help. But it doesn’t really.

Now we have a better option: an Office (Word, Excel… etc) plugin called “SDL Language Cloud – Translation for Documents” (an incredibly long name, I know). Besides the fact that it actually works… let’s see how it can help us.


Install the plugin

Open MS Word and go to “Insert” -> Add-ins (Store) -> “SDL Language Cloud – Translation for Documents”. You will also need an SDL Language Cloud subscription (free trials available).

Start translating

This plugin sends your text to automatic machine translation engines and gives you complete translations back, which you can then edit to your liking if needed.

This can be done for selected text or for the entire document at once (preserving formatting, pictures… etc).

Translating pieces of text

Select the text, choose the From and To languages (there is no autodetection) and decide if you want the translation to replace the text or to come after it (so you can still see both more or less in parallel).

Translating the entire document

Switching to the option for entire document translation brings up a new possibility: usage of dictionaries (eg: terminology).

But first, it’s nice that this creates a new document in the destination language. It does seem to take a while, maybe several minutes.

About the dictionary: this is very useful if you have a document with some words you would like to always be translated in a certain manner (or not translated at all… such as brand names which are actually normal words) or company taglines. To do this, you must define the dictionary in Language Cloud online and then use it in this plugin.
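To make the dictionary idea a bit more concrete, here is a minimal Python sketch of how terminology can be forced around any machine translation engine. This is not how the SDL plugin works internally (I have no insight into that); the function names, the placeholder tokens and the toy engine are all made up for illustration.

```python
import re

def translate_with_glossary(text, translate, glossary, keep_as_is=()):
    """Translate text, forcing glossary terms and protecting brand names.

    `translate` is any callable str -> str (a stand-in for a real engine).
    """
    protected = {}

    def protect(term, replacement):
        nonlocal text
        token = f"__TERM{len(protected)}__"  # hypothetical placeholder format
        pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
        if pattern.search(text):
            text = pattern.sub(token, text)
            protected[token] = replacement

    # Brand names stay exactly as they are.
    for term in keep_as_is:
        protect(term, term)
    # Glossary terms always get the prescribed translation.
    for source_term, target_term in glossary.items():
        protect(source_term, target_term)

    translated = translate(text)
    for token, replacement in protected.items():
        translated = translated.replace(token, replacement)
    return translated


def toy_engine(text):
    """A fake word-for-word 'engine', only here so the sketch can run."""
    words = {"the": "le", "new": "nouveau", "is": "est", "here": "ici"}
    return " ".join(words.get(w.lower(), w) for w in text.split())


result = translate_with_glossary(
    "The new Apple invoice is here",
    toy_engine,
    glossary={"invoice": "facture"},
    keep_as_is=["Apple"],
)
print(result)
```

The point of the placeholders is that the engine never sees the protected words, so it cannot "helpfully" translate a brand name that happens to be a normal word.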

That’s it.

All in all… much better than the Microsoft Translate option in Office… which does not work for selected text, and for the entire document just opens it in a web view from which you can’t even download it.

Key things:

  • Works in the regular Word application and in the Online version
  • Translating pieces of text is easy, “in place” or “appending” to your text
  • Translating documents preserves the formatting, pictures, tables etc
  • You can use dictionaries to make sure certain things are always translated as prescribed (terminology)

Disclaimer: I work for SDL and this is how I came to learn about this plugin.

A contest for software developers

We recently organized a software development challenge to give newcomers an opportunity to market their application to an enterprise cloud platform. And we did this exactly when the “app” ecosystem buzzed with the fact that enterprise apps offer a greater payback than consumer ones.

Each and every software developer dreams of hitting it big with their great idea. Which is great and perfectly understandable once you get to really know the psychology of us, coding geeks. Instead of dismissing that, let’s harness the energy and help such wannabe entrepreneurs give their idea a shot.

So we did. The idea was simple: you bring the app, we bring the infrastructure and… more importantly… the “go to market”. Sales, channel, marketing, enterprise insight, support… everything a small dev shop does not have.

Oh boy… what a ride! Speaking at dev conferences, powering up the community to talk about it, raiding university campuses around the country, talking one-to-one with potential participants… And the entries started to come. Some funny. Some serious. Some completely off-course. Some spot-on.

As this is now closed, I have a few suggestions for early-stage software development entrepreneurs:

Do it.
If you have an idea you think is worth it, just do it. There are probably at least 1000 other people like you in the world with the same idea. If you act on it, you have probably already surpassed most of them. That should encourage you enough. On the other hand… think about why there aren’t so many “doing it”: because it’s easier to only think about it than to move a muscle. Don’t “gold polish”! Build an MVP as soon as possible.

“If you are not embarrassed by the first version of your product, you’ve launched too late,” said Reid Hoffman, and it’s close to perfect.

Find the buyer.
Somebody needs to pay for what you do. Why should they pay? Is what you’re doing worth their money? In their opinion, not yours. This is where things get awkward: you love what you do… but can’t honestly find somebody to pay for it. You’re an artist 🙂 ! We’ve seen this in our competition a lot. The perfect example was an app which aimed to teach you how to survive a disaster, and the first thing it said was “Don’t panic”. Really. Hitchhikers aside, if I open an app after a disaster, then for sure I’m not panicking… am I?

The answers to the “who can pay for it” questions are actually very hard to write down. But simple mathematics and a bit of honest estimation would be close to enough to justify a first commercial release. Or to put the project in the “academic experiments” box.

Publish it.
No matter how marvelous the idea is, and how beautifully it is executed, if you can’t get it in front of the customer in a convincing manner, you would need a miracle to sell it. It might happen, but it’s much safer to actually work on that.

Well now, my fellow software developers, I know this sounds so “high level”, but it won’t hurt to spend some time on these topics, as high level as they are. If I had done that many years ago, I would probably have many more successful products now. And if this were taught in school, I would have had many more great submissions in my competition.

How is this linked to Enterprise Content Management?

It is. ECM needs innovation and this can come only from new entrepreneurs. It’s one of the emerging technology areas (just look at the sprawl of content-producing devices, imagine the sheer volume of such “big content”) and it’s old enough to have a massive legacy waiting for disruption. I just don’t want to see many good ideas go nowhere because their owners didn’t seriously consider all three pillars above.

Recognizing that you don’t master it all and partnering to see it through. Now that’s a good strategy!

A psychology of cloud

There is one thing which makes cloud offerings appealing in general: these are impersonal social transactions.

Now, let me clarify this a bit: when somebody buys an IT software solution, the usual way of doing this is by directly talking with the software development company (or a system integrator, somebody doing IT consulting… etc). This means the buyer and the seller interact in a social manner (even though usually strictly professional 😉 ) and work their way through the finalization of the “project”. A simpler case is out-of-the-box products, but such things are rare in the business solutions space.

Cloud changes this. Not by itself, but it can really take off once people on all sides start taking advantage of it.

Everybody says cloud solutions are “simple”, “easy”… and we quickly think about this as being “feature”-simple/easy. This is a bit true, but a major incentive is the fact that the buyer can obtain his reward without any emotional involvement. He does not need to actually “talk” to some other person, he doesn’t need to explain himself to somebody. He simply clicks. This is where the lure of cloud is.

Of course, this does not mean the traditional way of selling software solutions will die. It’s exactly like going to the mall/supermarket vs. going to your local corner shop. Both will coexist, each of them adapting to the market demands.

Now, if any of this is true, we will see vendors adapt to fit this consumer psychology. Maybe it’s not such a big surprise that Amazon is so successful in cloud now… is it?

We need wholesalers and retailers for software solutions. Cloud providers are going to do this and software vendors will transform to meet their demand. But just as in the current physical goods economy, in cloud you’ll find some brands and a certain uniform quality while we will also have local shops with “limited series” products.

This will drive standardization, because you can’t have a supermarket unless you have many similar products (comparable by some metrics the customer can understand). And here lies a very big challenge, since in the business solutions area there are currently no pervasive standards, formal or informal. I believe the vendors’ struggle to be present in cloud with their business solutions (something which we will see more and more in the next year(s)), together with consumer behavior, will generate these standards.

Clearly interesting times…

Content Management Solutions Interoperability

No, this is not about CMIS, although the topic can be linked together not just through an acronym.

This is a bit inspired by the recent identity crisis manifested in the “Enterprise Content Management” space, with many esteemed professionals doing their best to define “content” itself or, more recently, to replace “Enterprise” with “Easy” within ECM. While these kinds of discussions are good and provide at least food for thought, I would like to see how we can go forward.

After a period of effervescence in the mid-2000s, when I look around to see what kind of solutions and software we have for solving common business problems around content… I don’t see a significant evolution.

Sure, there’s this “cloud” thing. But we’ve been doing SaaS for a much longer time. And there’s file sharing, that’s indeed cool.

What I do believe is missing today is a roadmap to have inter-operability at the business level. Everybody is tackling the “information silo” problem and, of course, this creates new silos based on brand new shining technology.

When Cloud was born, a lot of people said we need to have cloud inter-operability. To integrate one cloud provider with another. But most of the reasons behind this are technical and even if you relate them to business needs, there are always more pressing itches which need to be scratched, so the money and time will go elsewhere.

What the “business” actually needs is a way to scratch quickly and later on to exchange its information with other business areas/departments/systems/you get the idea. I don’t want to sound like a prophet, but I think we’ll see more and more new information silos created every day, much faster than any consolidation can occur. And we should not be afraid of this.

These silos need to inter-operate. This is the actual need. Technically, we have CMIS, for example. But the technical issue is not the main roadblock. It’s the data, with its meaning (context, as it can be referred to). How do you exchange data?

Picture this:

What if every major ECM system (EMC Documentum, IBM FileNet, Microsoft SharePoint, OpenText XYZ, Alfresco etc.) came preloaded with a set of predefined, extensible “schemas” to exemplify how an “invoice” should be modeled? Or a “drawing”, or a “project plan”, or “meeting minutes”?

What if these schemas were built together by a consortium of all the people implementing solutions for customers (eg: OASIS)?

Imagine what this would do for enabling value driven inter-operability.
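To picture what such a predefined but extensible “schema” could look like, here is a small Python sketch. Everything in it is hypothetical: the field names, the `Schema` class and the “acme” extension are mine, not something any vendor or consortium has published.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class FieldDef:
    name: str
    kind: type
    required: bool = True

@dataclass
class Schema:
    name: str
    fields: dict = field(default_factory=dict)

    def extend(self, name, *extra):
        """Return a new schema: all shared fields plus customer-specific ones."""
        merged = dict(self.fields)
        merged.update({f.name: f for f in extra})
        return Schema(name, merged)

    def validate(self, record):
        """Return a list of problems; an empty list means the record conforms."""
        problems = []
        for f in self.fields.values():
            if f.name not in record:
                if f.required:
                    problems.append(f"missing: {f.name}")
            elif not isinstance(record[f.name], f.kind):
                problems.append(f"wrong type: {f.name}")
        return problems

def make_schema(name, *fields):
    return Schema(name, {f.name: f for f in fields})

# The shared, "consortium" part (illustrative fields only).
INVOICE = make_schema(
    "invoice",
    FieldDef("number", str),
    FieldDef("issuer", str),
    FieldDef("total", float),
    FieldDef("currency", str),
)

# A customer-specific extension that stays compatible with the shared part.
ACME_INVOICE = INVOICE.extend(
    "acme_invoice", FieldDef("cost_center", str, required=False)
)

print(ACME_INVOICE.validate({"number": "A-1", "issuer": "X", "total": 10.0}))
```

Two systems that both know the shared `INVOICE` part can exchange an invoice and agree on what it means, even if each side adds its own extra fields on top.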

SharePoint and large files

Suppose you want to put SharePoint to some good use and you or your CIO thought “let’s just use SharePoint as a DMS”.

What can go wrong here? It can store files, users love it, we have plenty of licenses, integrates into Office, has search, versioning, metadata… all sorts of neat stuff.

This is completely true and SharePoint can definitely act as a very good DMS. If you know its limits and really understand that theoretical limits are a different beast in real life. As a side note, a DMS system is almost never just a neutral tool. It needs customization (or at least heavy configuration by a specialist) to implement business rules and processes. I’ll maybe touch on these in another post.

First, you will want to check the official Microsoft SharePoint boundaries page here. Even in “small” DMS environments you will need to pay close attention to the “File size” limit and to the “Content database items” one.

So, you’ve found out that the maximum file size is 2GB and that the recommended file size ranges between 50 and 250 MB. Chances are that you have many important files bigger than this. Jump to the internet and the best MSFT advice you’ll get is more or less summarized in this TechNet post. Which basically says that you can’t. Or shouldn’t.

Don’t get me wrong. Technically, anything is possible in software and there are convoluted ways of making SharePoint handle files over 2GB in size. Such as splitting the file into volumes and storing those instead… and aggregating them on download… messy stuff. Or use RBS for offloading the content database from SQL Server while you aim for the 4 TB limit (but still helpless for 2GB+). After which you are forced anyway to restructure the logical architecture of your core DMS business solution.

I, on the other hand, am of the opinion that a technology platform (eg: SharePoint) should help me concentrate on business solutions (eg: solving DMS problems). It should not require advanced expertise just to make things kinda’ work, and then force me to keep an eye on the business requirements and solution so they don’t break the limitations of my pretty little technical-very-backend architecture.
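Just to show what the “splitting into volumes” trick means in practice, here is a bare-bones Python sketch. It only works on in-memory streams and says nothing about how the volumes would actually be stored as SharePoint items; the function names and the checksum step are my own additions for illustration.

```python
import hashlib
import io

def split_into_volumes(stream, volume_size):
    """Yield consecutive chunks of at most `volume_size` bytes."""
    while True:
        chunk = stream.read(volume_size)
        if not chunk:
            break
        yield chunk

def join_volumes(volumes, out_stream):
    """Write the volumes back in order and return a SHA-256 of the result."""
    digest = hashlib.sha256()
    for volume in volumes:
        out_stream.write(volume)
        digest.update(volume)
    return digest.hexdigest()

# Tiny demo: a 2560-byte "file" split into volumes of at most 1000 bytes.
data = bytes(range(256)) * 10
volumes = list(split_into_volumes(io.BytesIO(data), 1000))

restored = io.BytesIO()
checksum = join_volumes(volumes, restored)
print([len(v) for v in volumes], restored.getvalue() == data)
```

Even in this toy form you can see why it is messy: something has to remember the order of the volumes, check integrity on download, and none of the normal document library features understand that several items are really one file.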

Back to the issue: can it be done in SharePoint?

Yes. But we also need another content platform to handle the exceptions.

What we can do is build a custom document library (or let’s say, a new type of document library with our additional features) which enables content transfer to and from the user, using another content platform for storing the large files. This can also work to overcome the limitation on the number of items in a document library/content database.

Features can be implemented to create/destroy native SharePoint items as needed, so that the user will be able to use standard SharePoint document library features on some selected documents if they really need to (provided that the document is not larger than 2GB; in that case you are stuck with upload/download/stream only for the content… no inline editing for you).

This way, the SharePoint limitations will not apply anymore (since the features are presented “at the glass” and interact directly with the other “hidden” content platform) but you still have the SharePoint features for selected items should you wish to bring them into the native space. Searches can be done in a variety of ways. Either using SharePoint features (since it can index external stores) or the external CMS may have its own API for that.
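Here is a toy Python model of that idea, just to show the moving parts. It is not our actual implementation and it does not touch any SharePoint API: the two stores are plain dictionaries, and the class and method names are invented for this post.

```python
TWO_GB = 2 * 1024**3  # the SharePoint maximum file size discussed above

class HybridLibrary:
    """Content lives in an external platform; native items exist on demand."""

    def __init__(self, native_limit=TWO_GB):
        self.native = {}    # stand-in for native SharePoint items
        self.external = {}  # stand-in for the "hidden" content platform
        self.native_limit = native_limit

    def upload(self, name, content):
        # Everything goes to the external platform, whatever its size.
        self.external[name] = content

    def promote(self, name):
        """Create a native item so standard library features can be used."""
        content = self.external[name]
        if len(content) > self.native_limit:
            raise ValueError(f"{name} is too large for the native space")
        self.native[name] = content

    def demote(self, name):
        """Destroy the native item; the external copy remains."""
        self.native.pop(name, None)

    def download(self, name):
        # Prefer the native item if one exists, otherwise stream from external.
        if name in self.native:
            return self.native[name]
        return self.external[name]

# Demo with a tiny limit so the example stays small.
lib = HybridLibrary(native_limit=10)
lib.upload("small.txt", b"hello")
lib.upload("big.bin", b"x" * 50)
lib.promote("small.txt")
print(sorted(lib.native), sorted(lib.external))
```

A real version would also have to keep the native copy and the external copy in sync after edits, which this sketch happily ignores.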

Security is also ok, since you will choose a content platform which can really do item level security even for many or large files (in SharePoint you are advised to not use item level security if you have many documents in your library).

Linking the external CMS underneath a SharePoint document library gives you a lot of advantages… I’ll let you discover those, this is just a blog post.

One more hint: if your chosen content platform exposes itself as a CMIS provider, then you really hit the jackpot. Strategically speaking, because in real life SharePoint cannot act as a CMIS client anymore (since ShP 2013, although it could in 2010). But I think they will not be able to ignore this in the future and anyway you’ll find partners developing ShP addons to expose CMIS in the UI.

Here you have it. I just shared with you our solution for making SharePoint work as a DMS given its large files limitation (it also works for the “many files” limitation).

Want to deliver ECM solutions in cloud?

While I’m no expert on the topic, I’ve spent a lot of my last years doing just that within our company, which is not a pure player in the space. Here are some ponderings which should be useful both for providers and for customers.

If you’ve not been living under a rock for the last few years, you probably noticed the hype called “cloud” (not to mention “big data”, let’s just leave it at that for a moment). As many pointed out, “cloud” has many meanings and while some are new, a lot of work had already been done under different, not so cool names like PaaS, SaaS (why not AaaS?), IaaS… etc. Infinite elasticity, scalability and “always on” are subliminal promises vendors lure customers with. Some deliver this better than others. Some customers need it, some don’t. Well… it’s a free market after all.

Say you are a solution provider (integrator, services, whatever) and want to jump on this bandwagon. You call somebody in the company and say “let’s do this!”. If you’re thorough enough, you’ll make a business plan (at least). Please consider this:

1. You can’t just take on-premise software solutions and make them run in “cloud”.

Especially ECM stuff. If that solution is a good one, it’s probably very complex. Complexity doesn’t play nice with reusability. Cloud is about reusing as much as possible so you can make economies of scale. If you don’t care about that (eg: you expect fewer than 20 customers) then consider the people administering it. Will they be your staff? If so, do they need to learn a lot of specific characteristics for each of your customers? Hmm…

I’m sure all your enterprise customers will use the same software application, just configured differently. Won’t they? Will you educate them? Or you will be the next… Consider it. Of course, you can isolate them and basically have an individual instance with all the peculiar characteristics customized for each customer. I sure hope they will be paying you the big bucks, because it will be fun to keep that updated over time.

Or you will take the enterprise solutions you have and scale them down to fit a more “general”/small business need. It might work. Remember: keep it veeeery simple to use. Very. For example: Metadata? You should innovate here. Checkout/checkin? Find another way. Your 20+ action menu? Trim it down to 5. Can’t?

2. Infrastructure is not infinitely scalable

Put a virtualization layer on top, it eases things up. But you thought of that already. Reality check #1: virtual stuff runs slower than bare-metal hardware. Reality check #2: top grade infrastructure is expensive. Of course you can try the Google approach and build a lot using commodity hw. But you need special management tools and procedures to do it. And maybe even rethink your software solutions. Go to no. 1.

Some ECM solution components might not even be compatible with virtualization. Hehehe. Are they yours or do you purchase/OEM/integrate from other vendors? What’s their plan?

3. Business model is a different beast

How do you sell it? With the same sales people? How do you incentivize them? I’m sure you are aware that selling cloud stuff brings a tiny-tiny fraction of the initial revenue vs. selling one-off licenses & implementation services. When does your salesperson get their bonus? Will this work for them or will they continue to push for the classic sale even if you have cloud to offer?

On the good news side, if you sell in the ECM space you should have already been thinking about this for at least 1-2 years now… why? The days of the multi-million solution sale are gone. The majority of customers are now planning for smaller initial costs and to have services and extensions spread over multiple years. With the occasional exceptions, but if you’re counting on those, then you’re reading the wrong post.

What about licenses? Your ECM solution will most likely add up to an impressive number of third-party licenses. Are those vendors providing a clear and adequate licensing model for the cloud? Do you charge your users per GB stored? If so, is your storage vendor charging you based on the monthly reported usage? Eh?

4. Your internal processes and resources

Once you go live, you need to run it. If you have only a few customers, it’s probably not very much different from what you did until now. Especially if you already provided some sort of SaaS/PaaS before. But this is not the idea… “Cloud” should mean many more customers for you as it means more resources for your customers. Cheaper, for both. So, you will need to redesign internal processes (and tools) to be more effective. When you sell at 5 USD/user/month you need to take inspiration from Henry Ford. Considering you will still keep your current business model, beware how you mix people and resources around.

Support for an ECM solution is complicated. Will you take calls from all the end-users? Are your people capable of understanding this for all those many “cloud” customers you will hopefully have? Or will they just create frustration which will generate a lot of comments in the area of “why did we externalize this? it was much better when our (then thought lousy) IT handled it”. Remember, you’re changing a way of working, you’re not providing a solution to a greenfield area. What works for a telecom company providing commodity services and products will not apply in the ECM space.
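A bit of back-of-the-envelope arithmetic helps here. The Python sketch below computes how many months of subscription it takes to match a one-off sale. Only the 5 USD/user/month price comes from this post; the 60,000 USD deal size and the user counts are invented purely for illustration.

```python
import math

def months_to_match(one_off_revenue, users, price_per_user_month=5.0):
    """Months of subscription needed to equal a one-off sale of the same size."""
    monthly_revenue = users * price_per_user_month
    return math.ceil(one_off_revenue / monthly_revenue)

# Hypothetical numbers: a 60,000 USD classic deal vs. cloud at 5 USD/user/month.
small_customer = months_to_match(60_000, users=200)
large_customer = months_to_match(60_000, users=1000)
print(small_customer, large_customer)
```

With 200 users it takes 60 months (five years) to bring in what the classic deal brought up front; with 1000 users it takes 12. That is the gap your processes, your support and your sales bonuses have to live with.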

The list can go on…

Is it all bad?

No way! Just look around. Although I called cloud “hype”, it has passed that stage. Vendors have started to mature and many have realized the above items and are at various stages of addressing them.

This post is for you, our customers! Take care when choosing your cloud provider(s). I really do think you need one (chances are you need cloud rather than not). Do not evaluate Amazon, Box, EMC or a specialized solution provider in the same way. They are different, at least for the reasons stated above. Work with them to understand how they solved the challenges, see their actual experiences and be cautious of “I know how to do it but we have no customers yet” offerings.

EMC Momentum 2012 – some technical bits

Today was a day to dive into more technical details on the new D7 platform and mingle with other partners.

I attended 2 absolutely great sessions: Rohit Ghai’s “Transformation: In Action” keynote and the always great one by Jeroen on Documentum Architecture. I missed Ed Bueche’s performance session… there is only so much you can fit in a tight schedule.

Yesterday I was quite disappointed by the “monotonous” tone of Rick’s opening keynote. Today, Rohit delivered a completely different show. Very good topics, presented with a vivid atmosphere and rhetoric.

The talk nicely presented, through clear examples, the links from business challenges to the technology. Also, demos were done that showcased key integrations between EMC IIG’s new products.

If you care about Enterprise Content Management (not only of EMC), look at the keynote recording here.

From Jeroen’s session there are so many things to take away… here’s a high level list:

D7 and beyond: performance. Massive improvements in session and related memory management. Really matters. I think it’s a reason to plan for an upgrade in itself. NGIS is developed in parallel (more below).

xPlore 1.3: many updates, too technical for a high level list here. It is a key component of D7. New features for content processing as well as for manageability (command line tools etc). Has its own release cycle.

REST: the new API to use for the platform. All components will be exposed through REST. XML, JSON, AtomPub… My take on this: DFS and DFC are on the way out. The session included a live “demo” on navigating with REST (JSON) alongside creating/searching objects. Pagination included. Impressive and very very useful. Supports resource mobility (HTTP redirects), nice! Look out for

Line of Sight – monitoring your virtual deployment of Documentum 7 through the usage of common tools existing in the VMware product portfolio. Useful for OnDemand and similar deployments.

D2 and xCP: the usual stuff… (there is enough already written about these). One interesting thing: D2 is the client NGIS is currently being tested with; it seems to be the first client to work with NGIS (not xCP).

Captiva 7 – automatic classification and information extraction. Very good progress here, really the major thing about C7 (alongside the new UI).

EMC Migration Appliance – a consulting project aimed at quickly moving data between old style repositories and repositories used through D2/xCP… Obviously doesn’t move the UI customizations (eg: from Webtop).

Syncplicity – a generic connector (Server Synch Agent) between DMS and Syncplicity. Generic, because it can be extended for other repositories/systems.

NGIS – Finally, some info! Work is being done very actively (as I understood). Multi-tenancy, base cost per object almost 0, no downtime on upgrading, add servers on the fly, no single point of failure, advanced smart containers, no SQL database (my addition, not on the slide), advanced RBAC and ACLs side by side based on XACML, distributed query execution… and many many more (I have a picture, but my BB can’t upload it… sorry)

CMIS – not on the slides, but I went out and asked about it. Jeroen gave a politically correct answer. It’s clear that since CMIS does not have too much traction in the ECM space… this is not a high priority. It’s a vicious circle. Would have been very nice to see it 1:1 with the REST revolution…

For a very detailed live conference report… look at #mmtm12 on Twitter. Several people are doing a great job live tweeting everything.
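Coming back to the REST item in the list: to give an idea of what “navigating with pagination” means on the client side, here is a generic Python sketch that follows “next” links through pages of JSON. To be very clear, this is not the Documentum REST format (I only saw it on a slide and in a demo); the URLs, the `entries`/`links` layout and the fake server are all invented so the example can run on its own.

```python
def iter_all_entries(fetch, first_url):
    """Follow 'next' links page by page and yield every entry.

    `fetch` is any callable url -> dict (parsed JSON); a real client would
    do an HTTP GET here.
    """
    url = first_url
    seen = set()  # guards against a server that links pages in a loop
    while url and url not in seen:
        seen.add(url)
        page = fetch(url)
        yield from page.get("entries", [])
        url = next(
            (link["href"] for link in page.get("links", [])
             if link.get("rel") == "next"),
            None,
        )

# A fake, in-memory "server" with two pages (entirely hypothetical layout).
FAKE_SERVER = {
    "/objects?page=1": {
        "entries": ["doc-a", "doc-b"],
        "links": [{"rel": "next", "href": "/objects?page=2"}],
    },
    "/objects?page=2": {"entries": ["doc-c"], "links": []},
}

all_docs = list(iter_all_entries(FAKE_SERVER.__getitem__, "/objects?page=1"))
print(all_docs)
```

The nice part of link-driven navigation is that the client never builds URLs itself; it just follows what the server hands back, which is also what makes things like redirects workable.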