Data Persistence Dilemma

Sometimes, in order to solve a problem, I have to think through it out loud (or in this case, in writing).

Here’s the sitch: I’m making an app for knitters and crocheters. They need to be able to manage projects (e.g. “Baby blanket,” “Socks for mom,” etc.). In addition to a bunch of metadata about each project, users should be able to add photos and designate one or more images or PDF files as the project’s pattern. A PDF or image of the pattern isn’t required, but including one will allow users to enter a split view where they can view the pattern and operate a row counter at the same time.

A project shouldn’t necessarily “own” its pattern. In other words, a pattern can have multiple projects associated with it (say you want to make the same baby blanket for multiple babies), so as to avoid the needless duplication of the pattern file. A pattern can exist without a project and a project can exist without a pattern, but when linked, the project becomes the child of the pattern.

My user base includes people who may not always have an internet connection. Therefore, all data needs to be stored locally. However, those who do have an internet connection are going to want iCloud sync between devices.

I like Core Data. If I were to set this up in Core Data, without any consideration of iCloud syncing, I’d create Project and Pattern entities, store images and PDFs in the file system, and call it a day.

iCloud syncing is where things get murky for me. Core Data + iCloud is deprecated, and I don’t want to use it. Not only that, I don’t know what to do with the PDFs and images. Storing them as BLOBs in Core Data seems like a bad idea. I understand how to save them to the file system but don’t understand how to sync them via iCloud and also have a reference to them in Core Data. Do I use iCloud Document Storage for them? Do I zip them up somehow (NSFileWrapper??) and use UIDocument? How do I store a reference to them in Core Data (just the file name of the UIDocument, since the file URL is variable)? If users will be adding photos and PDFs at different times, do I use one UIDocument subclass for photos and one for PDFs, or do I use a single document and update it with the added information? You can tell I obviously have no idea how this works, and a multitude of Google searches has yet to clear it up for me.
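To make the “just the file name” idea concrete, here is a minimal sketch that assumes the files live in a single directory whose location is looked up fresh on each launch. `AttachmentStore` and its methods are hypothetical names, not part of any Apple API, and the sketch says nothing about how the directory itself would be synced.

```swift
import Foundation

// Hypothetical helper: only the file name is persisted (for example in a
// Core Data string attribute); the full URL is rebuilt at runtime because
// the container path can differ between launches and devices.
struct AttachmentStore {
    let baseDirectory: URL

    // Writes the data under a fresh unique name and returns that name,
    // which is the only thing the database needs to remember.
    func save(_ data: Data, preferredExtension ext: String) throws -> String {
        try FileManager.default.createDirectory(at: baseDirectory,
                                                withIntermediateDirectories: true)
        let fileName = UUID().uuidString + "." + ext
        try data.write(to: baseDirectory.appendingPathComponent(fileName),
                       options: .atomic)
        return fileName
    }

    // Resolves a stored file name against the current base directory.
    func url(forFileName fileName: String) -> URL {
        baseDirectory.appendingPathComponent(fileName)
    }
}
```

Because the stored value is never an absolute path, the same record stays valid if the base directory moves.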

As for the rest of the information in Core Data, I’m thinking of trying to sync it with CloudKit using something like the open source version of Ensembles or Seam3.

I guess I’m not sure if I’m on the right track and would welcome any advice/feedback. I’d really like to stay away from non-Apple services (like Realm) for the time being. Comments are open!

7 thoughts on “Data Persistence Dilemma”

  1. Hi Becky, the app sounds like a nice idea (my wife could be a beta tester, she does a lot of knitting and crochet).

    Regarding the sync + local issue, this is very similar to the kind of thinking I went through when starting work on Findings (an app for scientists). I wanted storage that works locally and that also works when used on Dropbox from multiple devices (or iCloud Drive, or Google Drive, etc.), but in a way that would be completely transparent to the app: a file is a file is a file, and whatever Dropbox/iCloud/Google Drive is doing, it should just keep working. This is what led me to the ‘PARStore’ format (I gave a Blitz talk at NSConf 5, and see: https://mjtsai.com/blog/2014/05/21/findings-1-0-and-parstore/). A Findings library is now a collection of PARStore files, with a library manager layer on top that keeps track of modified files and builds a cache for indexing and for parts of the UI (see another Blitz talk at NSConf 6). The foundation was quite a bit of work, but has been rock solid and this strategy brings a lot of flexibility to the way a document can be stored, shared, synced and merged.

    If I were to start an app like yours, I’d copy a lot of my Findings code, but I can see how the above might be a lot of work (and maybe overkill?). I’d stay away from Core Data sync, though, and look into Ensembles indeed.

    Charles

    • Becky H. says:

      Hi Charles!

      PARStore seems like a really well-designed syncing solution, but you’re right, setting up the foundation is probably a little more work than I’m ready for at this point. Still, it’s good to know that it’s out there. Findings is a beautiful app, by the way! Oh, and I’d love it if your wife could be a beta-tester when the app is at that stage. :) Thanks for commenting!

      Becky

  2. Becky, I think I’d get the local storage sorted out first. I like the idea of referring to a file; blobs seem icky to me too.

    You could write a “sync” mechanism to save data using CloudKit and keep local last saved and sync’d times so you can know when you need to push and/or pull data.
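The timestamp bookkeeping described above could look roughly like this. It is a sketch under assumptions: `SyncState`, `SyncAction`, and `syncAction(for:remoteModifiedAt:)` are hypothetical names, the remote modification date is assumed to come from whatever cloud API is used, and resolving a conflict is left open.

```swift
import Foundation

// What a record needs once local and remote timestamps are compared.
enum SyncAction: Equatable {
    case upToDate, push, pull, conflict
}

// Per-record bookkeeping kept alongside the local data.
struct SyncState {
    var localModifiedAt: Date   // time of the last local save
    var lastSyncedAt: Date?     // nil means the record was never synced
}

// Decides whether to push, pull, or do nothing. A record counts as changed
// on a side if that side was modified after the last successful sync.
func syncAction(for state: SyncState, remoteModifiedAt: Date?) -> SyncAction {
    let localChanged = state.lastSyncedAt.map { state.localModifiedAt > $0 } ?? true
    let remoteChanged: Bool
    if let remote = remoteModifiedAt {
        remoteChanged = state.lastSyncedAt.map { remote > $0 } ?? true
    } else {
        remoteChanged = false   // nothing on the server yet
    }
    switch (localChanged, remoteChanged) {
    case (false, false): return .upToDate
    case (true, false):  return .push
    case (false, true):  return .pull
    case (true, true):   return .conflict
    }
}
```

Comparing clocks from different devices is fragile, so a real implementation would probably treat `.conflict` conservatively instead of picking the newer date.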

    Using iCloud for storing the freestanding files sounds like a winner to me, especially if you can keep the names consistent.

    I’ve worked on a few solutions that do what you’re after but I’ve never used CloudKit, so I can’t give you any decent feedback there. Maybe someone else has experience worth sharing.

    Good luck. It sounds like a fun project.

    • Becky H. says:

      Thanks Rob! You’re right, I should focus on getting local storage working smoothly before adding a syncing solution! I’ve gotten a lot of good feedback though and I think I’m close to figuring out what will work best for my project. Thanks for responding!

  3. I like Core Data too. And I like CloudKit.

    Sounds like Project has a to-many relationship to Pattern, and a Pattern has a to-many relationship to Project. The Core Data delete rules will be to nullify in both directions.

    For your Pattern image files, you could create an ‘imageData’ attribute of type NSData and choose the ‘Allows External Storage’ option. This lets Core Data decide where the data lives: small values are stored as a blob in the database, and if the image is large (over ~150K) the data is written to a separate file on disk as an optimization. Either way it’s transparent to you – you’ll always get and set an imageData property that is of type NSData.

    This is a great approach – super simple. HOWEVER, I have run into performance issues during database migrations when you have a large number of these Pattern objects (with large imageData properties). Apparently during migration Core Data copies each of these files to a new location for the new store, and it is slow.

    If you think this could be the case for your users, then it would be better to save the image data to disk yourself, and store the path to the file as an ‘imagePath’ string attribute on Pattern. You’re doing the same thing Core Data does, but you lose a couple things – namely the atomicity of the save and deleting the file when the object is deleted.
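If the files are managed by hand, one way to get back the cleanup that Core Data would otherwise do is an occasional sweep for files no record points to any more. The sketch below assumes every attachment lives in one flat directory and that the set of referenced file names has already been fetched from the store; `removeOrphanedFiles(in:referencedNames:)` is a hypothetical helper.

```swift
import Foundation

// Deletes every file in `directory` whose name is not in `referencedNames`
// and returns the removed names in sorted order.
@discardableResult
func removeOrphanedFiles(in directory: URL,
                         referencedNames: Set<String>) throws -> [String] {
    let fileManager = FileManager.default
    var removed: [String] = []
    for name in try fileManager.contentsOfDirectory(atPath: directory.path) {
        // Keep anything a record still refers to.
        guard !referencedNames.contains(name) else { continue }
        try fileManager.removeItem(at: directory.appendingPathComponent(name))
        removed.append(name)
    }
    return removed.sorted()
}
```

A sweep like this does not restore atomic saves, but it keeps deleted objects from leaving files behind indefinitely.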

    Since this is a brand new project, I would suggest letting Core Data handle it for you. You’ll be up and running faster. If it proves to be a problem one day, it’s not a huge problem to fix.

    Now, with regards to CloudKit, I think it’s fantastic. I’ve used it in four projects to sync with Core Data and couldn’t be happier. But there is no built-in syncing mechanism between Core Data and CloudKit. You have to replicate your Core Data model in CloudKit. So you’ll need methods to convert a Pattern object to a CKRecord, and vice versa.
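The shape of those conversion methods might be sketched as follows. To keep the example runnable without CloudKit, `RemoteRecord` is a plain stand-in for CKRecord and `PatternValues` is a hypothetical value-type mirror of the Pattern entity; the field names are assumptions as well.

```swift
import Foundation

// Stand-in for CKRecord: a record type, a unique name, and key-value fields.
struct RemoteRecord: Equatable {
    var recordType: String
    var recordName: String
    var fields: [String: String]
}

// Hypothetical plain-value mirror of the Pattern entity.
struct PatternValues: Equatable {
    var id: UUID
    var title: String
    var fileName: String?
}

extension PatternValues {
    static let recordType = "Pattern"

    // Local object -> remote record.
    func toRecord() -> RemoteRecord {
        var fields = ["title": title]
        if let fileName = fileName { fields["fileName"] = fileName }
        return RemoteRecord(recordType: Self.recordType,
                            recordName: id.uuidString,
                            fields: fields)
    }

    // Remote record -> local object; fails if the record is not a valid Pattern.
    init?(record: RemoteRecord) {
        guard record.recordType == Self.recordType,
              let id = UUID(uuidString: record.recordName),
              let title = record.fields["title"] else { return nil }
        self.init(id: id, title: title, fileName: record.fields["fileName"])
    }
}
```

Using the object’s own identifier as the record name gives both sides a stable key to match on when changes come back from the server.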

    CloudKit has a really nice sync API – you give it a token and it will tell you about all additions, deletes, updates, etc. since the last sync. You update your Core Data objects from these records.

    And naturally, you tell CloudKit about all the changes on your side as well so other devices pick up the changes.

    I would suggest reading Kugler and Eggert’s fantastic book on Core Data (in Swift): https://www.objc.io/books/core-data/. They talk about an approach to syncing (using any cloud storage) that works very well. One of the authors worked at Apple on converting the iOS Photos app to using Core Data, so they kind of know their stuff.

    • Becky H. says:

      Hi Jason,

      Wow, thanks for your thoughtful reply and for the Kugler & Eggert book recommendation. I bought it last night and started digging in—even if I don’t end up using their sync architecture, I’ve already come across so many great tips for making life easier with Core Data that it was definitely worth the price!

      It’s also good to know that large objects in a Core Data store can slow down migrations. I don’t think that’ll be a huge problem for me, but at least I’m aware of it now! Thanks again.

      Becky

Comments are closed.