Artem Chubaryan
It ain't much but it's honest work

It ain't much but it's honest work

Debugging Story: The Case of Rogue Reflection

Generics + Reflection + Type Inference = 🤯

Artem Chubaryan's photo
Artem Chubaryan
·Jul 1, 2021·

7 min read

Intro

Square is known for its open-source libraries and you've probably read about the challenges of building and supporting them.

A lesser known fact is that we also build a lot of SDKs that let developers use Square's platform to create amazing apps for sellers. I want to talk about a particular issue I was debugging recently in one of our Android SDKs - Reader SDK.

Ok, without further ado let's dive in.

The report

Two days ago I got a message from one of our developers who uses the Reader SDK. They said that the latest version of the SDK was crashing when pairing a Bluetooth reader.

image.png

Caused by: kotlin.reflect.jvm.internal.KotlinReflectionInternalError: Cannot create type for an unsupported classifier: T (class kotlin.jvm.internal.TypeParameterReference)
        at kotlin.reflect.full.KClassifiers.createType(KClassifiers.kt:45)
        at kotlin.reflect.jvm.internal.ReflectionFactoryImpl.typeOf(ReflectionFactoryImpl.java:123)
        at kotlin.jvm.internal.Reflection.typeOf(Reflection.java:122)
        at com.squareup.cardreaders.BleBackendWorkflow.onWorkerNoop(BleBackendWorkflow.kt:162)

My first reaction "Yay, it's not a Bluetooth issue!" was quickly replaced with "Oh no, we messed up the type inference somewhere!". This is going to be fun to debug...

The bug

Let’s start by enjoying the luxury of having a clear stack trace. It has a few lines of Kotlin internals that are followed by a call from “our” method - BleBackendWorkflow.onWorkerNoop(). Its implementation and call site are pretty straightforward:

private fun <T> RenderContext.onWorkerNoop(
  worker: Worker<T>,
  key: String,
  handler: (T) -> Unit
) = runningWorker(worker, key) {
  handler.invoke(it)
  noAction()
}
...
val bytes: Observable<ByteArray> = stateMachine.events.data
context.onWorkerNoop(bytes.asWorker(), "") { ... }

And here are the runningWorker and asWorker implementations:

public inline fun <T, reified W : Worker<T>, PropsT, StateT, OutputT>
  BaseRenderContext<PropsT, StateT, OutputT>.runningWorker(
  worker: W,
  key: String = "",
  noinline handler: (T) -> WorkflowAction<PropsT, StateT, OutputT>
) {
  runningWorker(worker, typeOf<W>(), key, handler)
}

public inline fun <reified T : Any> Observable<out T?>.asWorker(): Worker<T> =
  this.toFlowable(BUFFER).asWorker()

I don't see any problems with the onWorkerNoop method. Android Studio is not complaining, unit tests pass, "works on my machine". Time to ask for some help!

image.png

Luckily for me, Aidan and Eric came for rescue and quickly found the problem in onWorkerNoop: T is not reified and the function should be inline, otherwise type erasure makes T as good as Any.

Yay! Oh wait... How on Earth does it compile and run?

image.png

The reflection upon reflection

Fun fact about reflection in Kotlin on JVM, is that it is present in two different components: kotlin-reflect and kotlin-stdlib.

On the JVM platform, the runtime component required for using the reflection features is distributed as a separate artifact kotlin-reflect.jar in the Kotlin compiler distribution. This is done to reduce the required size of the runtime library for applications that do not use reflection features.

On the JVM platform, the standard library contains extensions for reflection classes that provide a mapping to and from Java reflection objects (see package kotlin.reflect.jvm).

As a result, if kotlin-reflect is present in your runtime, its ReflectionFactoryImpl will be used instead of stdlib's ReflectionFactory

Instantiation of the factory happens during runtime in Reflection.java (yup, .java!)

ReflectionFactory impl;
try {
    Class<?> implClass = Class.forName("kotlin.reflect.jvm.internal.ReflectionFactoryImpl");
    impl = (ReflectionFactory) implClass.newInstance();
}
...
factory = impl != null ? impl : new ReflectionFactory();

Ok, so now we know that typeOf() has two different implementations and based on the stack trace, the app was using the one from kotlin-reflect. So what, the Reader SDK doesn’t depend on kotlin-reflect anyway? How does it help us understand the problem?

Well, here's the thing. While, the Reader SDK does NOT depend on the kotlin-reflect, the latter IS definitely being brought in by some other library transitively.

Transitive dependencies... diamond dependencies... we've bumped Kotlin to 1.4 in this release... type inference changes happened around 1.4…

image.png

Is it a non-matching lib version issue?

The culprit

The developer confirmed they're using Kotlin 1.4.10. Reader SDK uses 1.4.32, so we are overriding the older version during a dependency resolution. This means that kotlin-stdlib is also 1.4.32 (or even newer).

But we're not the ones bringing in kotlin-reflect, it's coming from somewhere else. Let's check the dependencies list and find all the kotlin things:

implementation "com.squareup.moshi:moshi-kotlin:1.9.3"
implementation "org.jetbrains.kotlin:kotlin-stdlib-jdk7:1.4.10"
implementation "androidx.room:room-ktx:2.3.0"
implementation 'androidx.navigation:navigation-fragment-ktx:2.3.5'
implementation 'androidx.navigation:navigation-ui-ktx:2.3.5'

moshi-kotlin... My good old friend. Oh, the memories of getting rid of you and your transitive kotlin-reflect dependency from our SDK. You came back to make my day more interesting.

Let's see which libraries does moshi-kotlin:1.9.3 bring along to the app.

image.png

Bingo! We've got kotlin-reflect:1.3.50 running along the kotlin-stdlib:1.4.32!

The difference

Once the difference in minor version numbers is found, everything starts making sense. Running unit tests with the kotlin-reflect:1.3.50 quickly leads to a consistent crash. Attaching a debugger, we can find the problem in our call to createType:

val descriptor = (this as? KClassifierImpl)?.descriptor
                     ?: throw KotlinReflectionInternalError("Cannot create type for an unsupported classifier: $this (${this.javaClass})")

this is TypeParameterReference with name="T" instead of being a KTypeParameterImpl for T. That’s not good, it means that code took different paths somewhere along the way. Putting a breakpoint on constructors on both of those classes will make this difference obvious.

1.3.50 image.png

1.4.0 image.png

See how Reflection.typeParameter:174 is same in both stack traces, but next method - typeParameter - is called from ReflectionFactory in 1.3.50, but in 1.4.0 it's now in ReflectionFactoryImpl.

My first guess would be that this is an override of a method. Comparing code between 1.3.50 and 1.4.0 confirms this theory - typeParameter method was overridden in 1.4.0 and now is "smarter" about the type of the parameter.

Fine, the behavior got smarter in new version. But what about backward compatibility? Why doesn't ReflectionFactory.typeParameter() implementation from kotlin-stdlib:1.4.0 work with kotlin-reflect:1.3.50?

The answer is pretty simple. Take a look at ReflectionFactory.typeParameter():

@SinceKotlin(version = "1.4")
public KTypeParameter typeParameter(Object container, String name, KVariance variance, boolean isReified) {
    return new TypeParameterReference(container, name, variance, isReified);
}

Here @SinceKotlin(version = "1.4") points at the right direction. The method doesn't exist in ReflectionFactory 1.3.50! Furthermore TypeParameterReference was also added to kotlin-stdlib in 1.4.0, so there is very little chance that kotlin-reflect:1.3.50 would know how to handle it!

At this point we've got enough evidence that kotlin-stdlib:1.4.0 ain't going to work with kotlin-reflect:1.3.50. So all we need to do is bump the kotlin-reflect version. It's as simple as updating its "parent" dependency - moshi-kotlin - to the latest (as of writing) version 1.12.0. Developer confirmed this resolved the issue.

Case closed?

The actual fix

Not yet. onWorkerNoop should still be updated to capture the type properly by having T marked as reified and inlining the whole method. Without this change, T won't get inferred correctly and we will be losing information about the caller worker's type.

private inline fun <reified T> RenderContext.onWorkerNoop(
  worker: Worker<T>,
  key: String,
  crossinline handler: (T) -> Unit
) = runningWorker(worker, key) {
  handler.invoke(it)
  noAction()
}

Now once we hit the breakpoint inside KClassifier.createType we can see that our classifier is aKClassImpl for ByteArray instead of a vague KTypeParameterImpl for T.

image.png ⬇️ image.png

Finally, we're done! Worker<T> became a Worker<ByteArray>, which means that we are inferring the types inside onWorkerNoop properly. This fix goes into the next release.

The afterword

As I was writing this blog post I suddenly realized that I had the unit tests around the aforementioned code fail when I refactored it into a onWorkerNoop this spring (of course it was me who introduced this bug). But back then I didn't understand the failure and wrote it off as another Workflow quirk. Here's what happened:

Workflow uses the type of Worker to distinguish different workers. If workers have the same type, one should provide a unique key. I was creating three workers via onWorkerNoop, all with different types, yet because of the missing reified marker they all ended up the same - Worker<T>. The error said exactly that:

Caused by: java.lang.IllegalArgumentException: Expected keys to be unique for worker com.squareup.workflow1.Worker<T>: key=""
    at com.squareup.workflow1.internal.SubtreeManager.render(SubtreeManager.kt:125)
    at com.squareup.workflow1.internal.RealRenderContext.renderChild(RealRenderContext.kt:61)

I failed to figure it out back then, and just added unique keys to each of the workers. I ignored this red flag and it ended up causing a crash in release. I wish I could say that this won't happen again, but it probably will. However, I'll try to pay more attention to seemingly insignificant failures.

There’s still an open question remaining: why does IDE allow signature reified W : Worker<T> without forcing T to also be reified? typeOf<W>() should be requiring both W and T to be reified IMO. Let me know if you have an answer to this!

That's all for today, see you later!

 
Share this